Overview

Dataset statistics

Number of variables: 50
Number of observations: 499
Missing cells: 6257
Missing cells (%): 25.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 134.4 KiB
Average record size in memory: 275.8 B
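
The missing-cell percentage can be checked from the other figures in this block. The sketch below assumes the percentage is missing cells divided by (variables × observations); the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 50
n_observations = 499
missing_cells = 6257

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Under that assumption the result rounds to the 25.1% shown in the report.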

Variable types

Numeric: 14
Categorical: 33
Unsupported: 3

Alerts

ethic_appr has a high cardinality: 498 distinct values High cardinality
study_1_conc has a high cardinality: 253 distinct values High cardinality
study_1_add_info has a high cardinality: 90 distinct values High cardinality
study_2_conc has a high cardinality: 274 distinct values High cardinality
study_2_add_info has a high cardinality: 91 distinct values High cardinality
study_3_conc has a high cardinality: 227 distinct values High cardinality
study_3_add_info has a high cardinality: 76 distinct values High cardinality
study_4_conc has a high cardinality: 253 distinct values High cardinality
study_4_add_info has a high cardinality: 87 distinct values High cardinality
design_add_fac has a high cardinality: 192 distinct values High cardinality
rank_add_fac_1 has a high cardinality: 67 distinct values High cardinality
df_index is highly correlated with study_1_add_info and 4 other fields High correlation
age is highly correlated with study_1_add_info and 5 other fields High correlation
rank_sci_repro is highly correlated with study_1_add_info and 6 other fields High correlation
rank_resp is highly correlated with study_1_add_info and 6 other fields High correlation
rank_just is highly correlated with study_1_add_info and 6 other fields High correlation
rank_anony is highly correlated with study_1_add_info and 6 other fields High correlation
rank_harms is highly correlated with study_1_add_info and 5 other fields High correlation
rank_balance is highly correlated with study_1_add_info and 8 other fields High correlation
rank_pub_interst is highly correlated with study_1_add_info and 3 other fields High correlation
rank_add_fac_1_pos is highly correlated with study_1_add_info and 7 other fields High correlation
rank_add_fac_2_pos is highly correlated with politic_views and 8 other fields High correlation
rank_add_fac_3_pos is highly correlated with study_1_add_info and 7 other fields High correlation
aware_sm_advan_score is highly correlated with study_1_add_info and 6 other fields High correlation
aware_sm_use_score is highly correlated with study_1_add_info and 2 other fields High correlation
sm_use is highly correlated with study_1_add_info and 5 other fields High correlation
gender_id is highly correlated with ethnic_id and 1 other field High correlation
ethnic_id is highly correlated with gender_id and 6 other fields High correlation
edu is highly correlated with study_1_add_info and 4 other fields High correlation
politic_views is highly correlated with ethnic_id and 5 other fields High correlation
aware_sm_res is highly correlated with study_1_add_info and 4 other fields High correlation
study_1_ethic_acc is highly correlated with study_1_add_info and 8 other fields High correlation
study_1_add_info is highly correlated with df_index and 35 other fields High correlation
study_2_ethic_acc is highly correlated with study_1_ethic_acc and 8 other fields High correlation
study_2_add_info is highly correlated with df_index and 28 other fields High correlation
study_3_ethic_acc is highly correlated with study_1_add_info and 6 other fields High correlation
study_3_add_info is highly correlated with df_index and 36 other fields High correlation
study_4_ethic_acc is highly correlated with study_1_ethic_acc and 8 other fields High correlation
study_4_add_info is highly correlated with df_index and 38 other fields High correlation
design_cont is highly correlated with study_1_add_info and 13 other fields High correlation
design_num_users is highly correlated with study_1_add_info and 12 other fields High correlation
design_res_purp is highly correlated with study_1_add_info and 11 other fields High correlation
design_len_data is highly correlated with study_2_add_info and 11 other fields High correlation
design_admin_inter is highly correlated with study_2_add_info and 11 other fields High correlation
design_inter_type is highly correlated with study_1_add_info and 9 other fields High correlation
design_partic_aware is highly correlated with study_1_add_info and 6 other fields High correlation
design_inter_impact is highly correlated with study_1_add_info and 12 other fields High correlation
design_type_data is highly correlated with study_1_add_info and 10 other fields High correlation
rank_add_fac_1 is highly correlated with sm_use and 27 other fields High correlation
rank_add_fac_2 is highly correlated with sm_use and 27 other fields High correlation
rank_add_fac_3 is highly correlated with df_index and 29 other fields High correlation
aware_sm_interact_score is highly correlated with study_1_add_info and 4 other fields High correlation
study_1_ethic_acc has 157 (31.5%) missing values Missing
study_1_conc has 242 (48.5%) missing values Missing
study_1_add_info has 407 (81.6%) missing values Missing
study_2_ethic_acc has 173 (34.7%) missing values Missing
study_2_conc has 225 (45.1%) missing values Missing
study_2_add_info has 405 (81.2%) missing values Missing
study_3_ethic_acc has 270 (54.1%) missing values Missing
study_3_conc has 271 (54.3%) missing values Missing
study_3_add_info has 423 (84.8%) missing values Missing
study_4_ethic_acc has 180 (36.1%) missing values Missing
study_4_conc has 246 (49.3%) missing values Missing
study_4_add_info has 412 (82.6%) missing values Missing
design_add_fac has 299 (59.9%) missing values Missing
rank_add_fac_1 has 418 (83.8%) missing values Missing
rank_add_fac_1_pos has 350 (70.1%) missing values Missing
rank_add_fac_2 has 473 (94.8%) missing values Missing
rank_add_fac_2_pos has 412 (82.6%) missing values Missing
rank_add_fac_3 has 477 (95.6%) missing values Missing
rank_add_fac_3_pos has 417 (83.6%) missing values Missing
df_index is uniformly distributed Uniform
ethic_appr is uniformly distributed Uniform
study_1_conc is uniformly distributed Uniform
study_1_add_info is uniformly distributed Uniform
study_2_conc is uniformly distributed Uniform
study_2_add_info is uniformly distributed Uniform
study_3_conc is uniformly distributed Uniform
study_3_add_info is uniformly distributed Uniform
study_4_conc is uniformly distributed Uniform
study_4_add_info is uniformly distributed Uniform
design_add_fac is uniformly distributed Uniform
df_index has unique values Unique
aware_sm_advan is an unsupported type, check if it needs cleaning or further analysis Unsupported
aware_sm_interact is an unsupported type, check if it needs cleaning or further analysis Unsupported
aware_sm_use is an unsupported type, check if it needs cleaning or further analysis Unsupported
rank_add_fac_1_pos has 12 (2.4%) zeros Zeros
rank_add_fac_2_pos has 11 (2.2%) zeros Zeros
rank_add_fac_3_pos has 11 (2.2%) zeros Zeros
aware_sm_advan_score has 47 (9.4%) zeros Zeros
aware_sm_use_score has 5 (1.0%) zeros Zeros

Reproduction

Analysis started: 2022-11-21 12:14:43.854181
Analysis finished: 2022-11-21 12:15:01.635959
Duration: 17.78 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json
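
The reported duration follows from the two timestamps. A minimal check using only the standard library, with the timestamps copied from the block above:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2022-11-21 12:14:43.854181")
finished = datetime.fromisoformat("2022-11-21 12:15:01.635959")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```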

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 499
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 250
Minimum: 1
Maximum: 499
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 25.9
Q1: 125.5
Median: 250
Q3: 374.5
95th percentile: 474.1
Maximum: 499
Range: 498
Interquartile range (IQR): 249
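
These quantiles are consistent with linear interpolation between order statistics. The sketch below assumes that method, and assumes df_index holds the integers 1 to 499 (suggested by the distinct count, minimum, maximum and strict monotonicity). The `percentile` helper is illustrative, not a pandas-profiling function.

```python
def percentile(sorted_values, q):
    """Quantile q in [0, 1] by linear interpolation between neighbours."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

# Assumption: df_index is simply 1, 2, ..., 499.
values = list(range(1, 500))

p5, q1, median, q3, p95 = (
    percentile(values, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95)
)
iqr = q3 - q1
value_range = values[-1] - values[0]
print(p5, q1, median, q3, p95, iqr, value_range)
```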

Descriptive statistics

Standard deviation: 144.1931575
Coefficient of variation (CV): 0.57677263
Kurtosis: -1.2
Mean: 250
Median Absolute Deviation (MAD): 125
Skewness: 0
Sum: 124750
Variance: 20791.66667
Monotonicity: Strictly increasing
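
Most of these figures can be reproduced with the standard library. The sketch assumes df_index is the sequence 1 to 499 and that the variance is the sample variance (divisor n - 1); the reported numbers match that choice.

```python
import statistics

# Assumption: df_index is simply 1, 2, ..., 499.
values = list(range(1, 500))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 divisor)
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, round(variance, 5), round(std, 7), round(cv, 8), mad)
```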
Histogram with fixed size bins (bins=50)
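
Common values

Value | Count | Frequency (%)
1 | 1 | 0.2%
329 | 1 | 0.2%
342 | 1 | 0.2%
341 | 1 | 0.2%
340 | 1 | 0.2%
339 | 1 | 0.2%
338 | 1 | 0.2%
337 | 1 | 0.2%
336 | 1 | 0.2%
335 | 1 | 0.2%
Other values (489) | 489 | 98.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
499 | 1 | 0.2%
498 | 1 | 0.2%
497 | 1 | 0.2%
496 | 1 | 0.2%
495 | 1 | 0.2%
494 | 1 | 0.2%
493 | 1 | 0.2%
492 | 1 | 0.2%
491 | 1 | 0.2%
490 | 1 | 0.2%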

sm_use
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 759.0 B

Value | Count
Facebook | 258
Reddit | 133
Twitter | 108

Length

Max length: 8
Median length: 8
Mean length: 7.250501002
Min length: 6
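
The length statistics follow directly from the three category counts. A small check, with the counts copied from the report and illustrative variable names:

```python
# Category counts copied from the sm_use overview.
counts = {"Facebook": 258, "Reddit": 133, "Twitter": 108}

n = sum(counts.values())
total_chars = sum(len(value) * c for value, c in counts.items())
mean_len = total_chars / n

# Expand to one length per observation to read off min, median and max.
lengths = sorted(len(value) for value, c in counts.items() for _ in range(c))
min_len, max_len = lengths[0], lengths[-1]
median_len = lengths[len(lengths) // 2]  # n is odd, so this is the middle item

print(n, total_chars, round(mean_len, 9), min_len, median_len, max_len)
```

The total of 3618 characters also matches the "Total characters" figure in the next block.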

Characters and Unicode

Total characters: 3618
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Facebook
2nd row: Twitter
3rd row: Facebook
4th row: Facebook
5th row: Twitter

Common Values

Value | Count | Frequency (%)
Facebook | 258 | 51.7%
Reddit | 133 | 26.7%
Twitter | 108 | 21.6%

Length

Histogram of lengths of the category

Category Frequency Plot

[Figure: category frequency plot]

Words

Value | Count | Frequency (%)
facebook | 258 | 51.7%
reddit | 133 | 26.7%
twitter | 108 | 21.6%

Most occurring characters

Value | Count | Frequency (%)
o | 516 | 14.3%
e | 499 | 13.8%
t | 349 | 9.6%
d | 266 | 7.4%
F | 258 | 7.1%
a | 258 | 7.1%
c | 258 | 7.1%
b | 258 | 7.1%
k | 258 | 7.1%
i | 241 | 6.7%
Other values (4) | 457 | 12.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3119 | 86.2%
Uppercase Letter | 499 | 13.8%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 516 | 16.5%
e | 499 | 16.0%
t | 349 | 11.2%
d | 266 | 8.5%
a | 258 | 8.3%
c | 258 | 8.3%
b | 258 | 8.3%
k | 258 | 8.3%
i | 241 | 7.7%
w | 108 | 3.5%

Uppercase Letter
Value | Count | Frequency (%)
F | 258 | 51.7%
R | 133 | 26.7%
T | 108 | 21.6%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3618 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
o | 516 | 14.3%
e | 499 | 13.8%
t | 349 | 9.6%
d | 266 | 7.4%
F | 258 | 7.1%
a | 258 | 7.1%
c | 258 | 7.1%
b | 258 | 7.1%
k | 258 | 7.1%
i | 241 | 6.7%
Other values (4) | 457 | 12.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3618 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
o | 516 | 14.3%
e | 499 | 13.8%
t | 349 | 9.6%
d | 266 | 7.4%
F | 258 | 7.1%
a | 258 | 7.1%
c | 258 | 7.1%
b | 258 | 7.1%
k | 258 | 7.1%
i | 241 | 6.7%
Other values (4) | 457 | 12.6%

age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 60
Distinct (%): 12.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.66332665
Minimum: 18
Maximum: 78
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 18
5th percentile: 23
Q1: 31
Median: 39
Q3: 51.5
95th percentile: 67
Maximum: 78
Range: 60
Interquartile range (IQR): 20.5

Descriptive statistics

Standard deviation: 13.63593166
Coefficient of variation (CV): 0.3272885954
Kurtosis: -0.5585557113
Mean: 41.66332665
Median Absolute Deviation (MAD): 10
Skewness: 0.5655176939
Sum: 20790
Variance: 185.9386323
Monotonicity: Not monotonic
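
Several of the age statistics are derived from others in the report, which allows a quick consistency check. The sketch uses only the rounded values printed above, so comparisons are made with a small tolerance; the variable names are illustrative.

```python
# Values copied from the age summary (already rounded by the report).
minimum, maximum = 18, 78
q1, q3 = 31, 51.5
std = 13.63593166
total, n = 20790, 499  # sum of ages and number of observations (none missing)

value_range = maximum - minimum   # reported as 60
iqr = q3 - q1                     # reported as 20.5
mean = total / n                  # reported as 41.66332665
cv = std / mean                   # reported as 0.3272885954
variance = std ** 2               # reported as 185.9386323

print(value_range, iqr, round(mean, 8), round(cv, 8), round(variance, 5))
```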
Histogram with fixed size bins (bins=50)
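
Common values

Value | Count | Frequency (%)
35 | 22 | 4.4%
34 | 20 | 4.0%
37 | 19 | 3.8%
29 | 18 | 3.6%
27 | 18 | 3.6%
26 | 17 | 3.4%
44 | 15 | 3.0%
31 | 15 | 3.0%
38 | 15 | 3.0%
23 | 14 | 2.8%
Other values (50) | 326 | 65.3%

Minimum 10 values

Value | Count | Frequency (%)
18 | 1 | 0.2%
19 | 4 | 0.8%
20 | 3 | 0.6%
21 | 2 | 0.4%
22 | 2 | 0.4%
23 | 14 | 2.8%
24 | 8 | 1.6%
25 | 10 | 2.0%
26 | 17 | 3.4%
27 | 18 | 3.6%

Maximum 10 values

Value | Count | Frequency (%)
78 | 1 | 0.2%
76 | 3 | 0.6%
75 | 1 | 0.2%
74 | 1 | 0.2%
73 | 2 | 0.4%
72 | 1 | 0.2%
71 | 2 | 0.4%
70 | 6 | 1.2%
69 | 3 | 0.6%
68 | 2 | 0.4%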

gender_id
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 831.0 B

Value | Count
Male | 282
Female | 207
Non-binary / third gender | 8
Prefer not to say | 2

Length

Max length: 25
Median length: 4
Mean length: 5.218436874
Min length: 4

Characters and Unicode

Total characters: 2604
Distinct characters: 23
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Male
2nd row: Male
3rd row: Female
4th row: Female
5th row: Female

Common Values

Value | Count | Frequency (%)
Male | 282 | 56.5%
Female | 207 | 41.5%
Non-binary / third gender | 8 | 1.6%
Prefer not to say | 2 | 0.4%

Length

Histogram of lengths of the category

Category Frequency Plot

[Figure: category frequency plot]

Words

Value | Count | Frequency (%)
male | 282 | 53.3%
female | 207 | 39.1%
non-binary | 8 | 1.5%
(blank) | 8 | 1.5%
third | 8 | 1.5%
gender | 8 | 1.5%
prefer | 2 | 0.4%
not | 2 | 0.4%
to | 2 | 0.4%
say | 2 | 0.4%

Most occurring characters

Value | Count | Frequency (%)
e | 716 | 27.5%
a | 499 | 19.2%
l | 489 | 18.8%
M | 282 | 10.8%
F | 207 | 7.9%
m | 207 | 7.9%
(space) | 30 | 1.2%
r | 28 | 1.1%
n | 26 | 1.0%
d | 16 | 0.6%
Other values (13) | 104 | 4.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2059 | 79.1%
Uppercase Letter | 499 | 19.2%
Space Separator | 30 | 1.2%
Dash Punctuation | 8 | 0.3%
Other Punctuation | 8 | 0.3%
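
The category totals above can be reproduced with Python's `unicodedata` module, which exposes the Unicode general category of each character. This is a sketch of the idea rather than the report's own implementation: it weights each character of each category label by that label's count, and uses the two-letter category codes (`Ll`, `Lu`, `Zs`, `Pd`, `Po`) instead of the long names shown in the report.

```python
import unicodedata
from collections import Counter

# Category counts copied from the gender_id overview.
counts = {
    "Male": 282,
    "Female": 207,
    "Non-binary / third gender": 8,
    "Prefer not to say": 2,
}

# Tally Unicode general categories, weighting each character by row count.
category_totals = Counter()
for value, n in counts.items():
    for ch in value:
        category_totals[unicodedata.category(ch)] += n

# Ll = Lowercase Letter, Lu = Uppercase Letter, Zs = Space Separator,
# Pd = Dash Punctuation, Po = Other Punctuation
print(dict(category_totals))
```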

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 716 | 34.8%
a | 499 | 24.2%
l | 489 | 23.7%
m | 207 | 10.1%
r | 28 | 1.4%
n | 26 | 1.3%
d | 16 | 0.8%
i | 16 | 0.8%
o | 12 | 0.6%
t | 12 | 0.6%
Other values (6) | 38 | 1.8%

Uppercase Letter
Value | Count | Frequency (%)
M | 282 | 56.5%
F | 207 | 41.5%
N | 8 | 1.6%
P | 2 | 0.4%

Space Separator
Value | Count | Frequency (%)
(space) | 30 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2558 | 98.2%
Common | 46 | 1.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 716 | 28.0%
a | 499 | 19.5%
l | 489 | 19.1%
M | 282 | 11.0%
F | 207 | 8.1%
m | 207 | 8.1%
r | 28 | 1.1%
n | 26 | 1.0%
d | 16 | 0.6%
i | 16 | 0.6%
Other values (10) | 72 | 2.8%

Common
Value | Count | Frequency (%)
(space) | 30 | 65.2%
- | 8 | 17.4%
/ | 8 | 17.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2604 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 716 | 27.5%
a | 499 | 19.2%
l | 489 | 18.8%
M | 282 | 10.8%
F | 207 | 7.9%
m | 207 | 7.9%
(space) | 30 | 1.2%
r | 28 | 1.1%
n | 26 | 1.0%
d | 16 | 0.6%
Other values (13) | 104 | 4.0%

ethnic_id
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1023.0 B

Value | Count
White / Caucasian | 397
African-American | 32
Mixed race | 20
Hispanic | 19
Asian - Eastern | 16
Other values (7) | 15

Length

Max length: 17
Median length: 17
Mean length: 16.15230461
Min length: 5

Characters and Unicode

Total characters: 8060
Distinct characters: 34
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 1.0%

Sample

1st row: Asian - Eastern
2nd row: Mixed race
3rd row: Pacific Islander
4th row: White / Caucasian
5th row: Native-American

Common Values

Value | Count | Frequency (%)
White / Caucasian | 397 | 79.6%
African-American | 32 | 6.4%
Mixed race | 20 | 4.0%
Hispanic | 19 | 3.8%
Asian - Eastern | 16 | 3.2%
Asian - Indian | 7 | 1.4%
Native-American | 3 | 0.6%
Asian - Southeast | 1 | 0.2%
Carribean | 1 | 0.2%
Other | 1 | 0.2%
Other values (2) | 2 | 0.4%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
(blank) | 421 | 30.8%
white | 397 | 29.1%
caucasian | 397 | 29.1%
african-american | 32 | 2.3%
asian | 24 | 1.8%
mixed | 20 | 1.5%
race | 20 | 1.5%
hispanic | 19 | 1.4%
eastern | 16 | 1.2%
indian | 7 | 0.5%
Other values (10) | 12 | 0.9%

Most occurring characters

Value | Count | Frequency (%)
a | 1353 | 16.8%
i | 956 | 11.9%
(space) | 866 | 10.7%
n | 540 | 6.7%
c | 505 | 6.3%
e | 497 | 6.2%
s | 459 | 5.7%
t | 421 | 5.2%
h | 399 | 5.0%
C | 398 | 4.9%
Other values (24) | 1666 | 20.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 5782 | 71.7%
Uppercase Letter | 956 | 11.9%
Space Separator | 866 | 10.7%
Other Punctuation | 397 | 4.9%
Dash Punctuation | 59 | 0.7%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 1353 | 23.4%
i | 956 | 16.5%
n | 540 | 9.3%
c | 505 | 8.7%
e | 497 | 8.6%
s | 459 | 7.9%
t | 421 | 7.3%
h | 399 | 6.9%
u | 398 | 6.9%
r | 109 | 1.9%
Other values (10) | 145 | 2.5%

Uppercase Letter
Value | Count | Frequency (%)
C | 398 | 41.6%
W | 397 | 41.5%
A | 91 | 9.5%
M | 20 | 2.1%
H | 19 | 2.0%
E | 16 | 1.7%
I | 8 | 0.8%
N | 3 | 0.3%
P | 2 | 0.2%
S | 1 | 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 866 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 397 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 59 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 6738 | 83.6%
Common | 1322 | 16.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 1353 | 20.1%
i | 956 | 14.2%
n | 540 | 8.0%
c | 505 | 7.5%
e | 497 | 7.4%
s | 459 | 6.8%
t | 421 | 6.2%
h | 399 | 5.9%
C | 398 | 5.9%
u | 398 | 5.9%
Other values (21) | 812 | 12.1%

Common
Value | Count | Frequency (%)
(space) | 866 | 65.5%
/ | 397 | 30.0%
- | 59 | 4.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8060 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 1353 | 16.8%
i | 956 | 11.9%
(space) | 866 | 10.7%
n | 540 | 6.7%
c | 505 | 6.3%
e | 497 | 6.2%
s | 459 | 5.7%
t | 421 | 5.2%
h | 399 | 5.0%
C | 398 | 4.9%
Other values (24) | 1666 | 20.7%

edu
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 983.0 B

Value | Count
Bachelor's degree | 222
Highschool | 153
Master's degree or above | 87
Associate's degree | 22
Some college | 7
Other values (2) | 8

Length

Max length: 24
Median length: 19
Mean length: 16.06412826
Min length: 10

Characters and Unicode

Total characters: 8016
Distinct characters: 27
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Highschool
2nd row: Highschool
3rd row: Bachelor's degree
4th row: Highschool
5th row: Highschool

Common Values

Value | Count | Frequency (%)
Bachelor's degree | 222 | 44.5%
Highschool | 153 | 30.7%
Master's degree or above | 87 | 17.4%
Associate's degree | 22 | 4.4%
Some college | 7 | 1.4%
Prefer not to say | 4 | 0.8%
Vocational training | 4 | 0.8%

Length

Histogram of lengths of the category

Category Frequency Plot

[Figure: category frequency plot]

Words

Value | Count | Frequency (%)
degree | 331 | 32.2%
bachelor's | 222 | 21.6%
highschool | 153 | 14.9%
master's | 87 | 8.5%
or | 87 | 8.5%
above | 87 | 8.5%
associate's | 22 | 2.1%
some | 7 | 0.7%
college | 7 | 0.7%
prefer | 4 | 0.4%
Other values (5) | 20 | 1.9%

Most occurring characters

Value | Count | Frequency (%)
e | 1440 | 18.0%
o | 754 | 9.4%
r | 739 | 9.2%
s | 619 | 7.7%
h | 528 | 6.6%
(space) | 528 | 6.6%
g | 495 | 6.2%
a | 434 | 5.4%
c | 408 | 5.1%
l | 393 | 4.9%
Other values (17) | 1678 | 20.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 6658 | 83.1%
Space Separator | 528 | 6.6%
Uppercase Letter | 499 | 6.2%
Other Punctuation | 331 | 4.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 1440 | 21.6%
o | 754 | 11.3%
r | 739 | 11.1%
s | 619 | 9.3%
h | 528 | 7.9%
g | 495 | 7.4%
a | 434 | 6.5%
c | 408 | 6.1%
l | 393 | 5.9%
d | 331 | 5.0%
Other values (8) | 517 | 7.8%

Uppercase Letter
Value | Count | Frequency (%)
B | 222 | 44.5%
H | 153 | 30.7%
M | 87 | 17.4%
A | 22 | 4.4%
S | 7 | 1.4%
P | 4 | 0.8%
V | 4 | 0.8%

Space Separator
Value | Count | Frequency (%)
(space) | 528 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
' | 331 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 7157 | 89.3%
Common | 859 | 10.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 1440 | 20.1%
o | 754 | 10.5%
r | 739 | 10.3%
s | 619 | 8.6%
h | 528 | 7.4%
g | 495 | 6.9%
a | 434 | 6.1%
c | 408 | 5.7%
l | 393 | 5.5%
d | 331 | 4.6%
Other values (15) | 1016 | 14.2%

Common
Value | Count | Frequency (%)
(space) | 528 | 61.5%
' | 331 | 38.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8016 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 1440 | 18.0%
o | 754 | 9.4%
r | 739 | 9.2%
s | 619 | 7.7%
h | 528 | 6.6%
(space) | 528 | 6.6%
g | 495 | 6.2%
a | 434 | 5.4%
c | 408 | 5.1%
l | 393 | 4.9%
Other values (17) | 1678 | 20.9%

politic_views
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 847.0 B

Value | Count
Very liberal | 150
Slightly liberal | 126
Slightly conservative | 96
Neutral/ Neither conservative or liberal | 89
Very conservative | 35

Length

Max length: 40
Median length: 21
Mean length: 20.11623246
Min length: 12

Characters and Unicode

Total characters: 10038
Distinct characters: 23
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Slightly liberal
2nd row: Neutral/ Neither conservative or liberal
3rd row: Very liberal
4th row: Slightly conservative
5th row: Very liberal

Common Values

Value | Count | Frequency (%)
Very liberal | 150 | 30.1%
Slightly liberal | 126 | 25.3%
Slightly conservative | 96 | 19.2%
Neutral/ Neither conservative or liberal | 89 | 17.8%
Very conservative | 35 | 7.0%
Prefer not to say | 3 | 0.6%

Length

Histogram of lengths of the category

Category Frequency Plot

[Figure: category frequency plot]

Words

Value | Count | Frequency (%)
liberal | 365 | 28.7%
slightly | 222 | 17.5%
conservative | 220 | 17.3%
very | 185 | 14.6%
neutral | 89 | 7.0%
neither | 89 | 7.0%
or | 89 | 7.0%
prefer | 3 | 0.2%
not | 3 | 0.2%
to | 3 | 0.2%

Most occurring characters

Value | Count | Frequency (%)
e | 1263 | 12.6%
l | 1263 | 12.6%
r | 1043 | 10.4%
i | 896 | 8.9%
(space) | 772 | 7.7%
a | 677 | 6.7%
t | 626 | 6.2%
v | 440 | 4.4%
y | 410 | 4.1%
b | 365 | 3.6%
Other values (13) | 2283 | 22.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 8589 | 85.6%
Space Separator | 772 | 7.7%
Uppercase Letter | 588 | 5.9%
Other Punctuation | 89 | 0.9%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 1263 | 14.7%
l | 1263 | 14.7%
r | 1043 | 12.1%
i | 896 | 10.4%
a | 677 | 7.9%
t | 626 | 7.3%
v | 440 | 5.1%
y | 410 | 4.8%
b | 365 | 4.2%
o | 315 | 3.7%
Other values (7) | 1291 | 15.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 222 | 37.8%
V | 185 | 31.5%
N | 178 | 30.3%
P | 3 | 0.5%

Space Separator
Value | Count | Frequency (%)
(space) | 772 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 89 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 9177 | 91.4%
Common | 861 | 8.6%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 1263 | 13.8%
l | 1263 | 13.8%
r | 1043 | 11.4%
i | 896 | 9.8%
a | 677 | 7.4%
t | 626 | 6.8%
v | 440 | 4.8%
y | 410 | 4.5%
b | 365 | 4.0%
o | 315 | 3.4%
Other values (11) | 1879 | 20.5%

Common
Value | Count | Frequency (%)
(space) | 772 | 89.7%
/ | 89 | 10.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10038 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 1263 | 12.6%
l | 1263 | 12.6%
r | 1043 | 10.4%
i | 896 | 8.9%
(space) | 772 | 7.7%
a | 677 | 6.7%
t | 626 | 6.2%
v | 440 | 4.4%
y | 410 | 4.1%
b | 365 | 3.6%
Other values (13) | 2283 | 22.7%

aware_sm_res
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 839.0 B

Value | Count
Moderately aware | 128
Very aware | 119
Slightly aware | 117
Not at all aware | 76
Extremely aware | 59

Length

Max length: 16
Median length: 15
Mean length: 13.98196393
Min length: 10

Characters and Unicode

Total characters: 6977
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Extremely aware
2nd row: Moderately aware
3rd row: Extremely aware
4th row: Moderately aware
5th row: Extremely aware

Common Values

Value | Count | Frequency (%)
Moderately aware | 128 | 25.7%
Very aware | 119 | 23.8%
Slightly aware | 117 | 23.4%
Not at all aware | 76 | 15.2%
Extremely aware | 59 | 11.8%

Length

Histogram of lengths of the category

Category Frequency Plot

[Figure: category frequency plot]

Words

Value | Count | Frequency (%)
aware | 499 | 43.4%
moderately | 128 | 11.1%
very | 119 | 10.3%
slightly | 117 | 10.2%
not | 76 | 6.6%
at | 76 | 6.6%
all | 76 | 6.6%
extremely | 59 | 5.1%

Most occurring characters

Value | Count | Frequency (%)
a | 1278 | 18.3%
e | 992 | 14.2%
r | 805 | 11.5%
(space) | 651 | 9.3%
l | 573 | 8.2%
w | 499 | 7.2%
t | 456 | 6.5%
y | 423 | 6.1%
o | 204 | 2.9%
M | 128 | 1.8%
Other values (10) | 968 | 13.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 5827 | 83.5%
Space Separator | 651 | 9.3%
Uppercase Letter | 499 | 7.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 1278 | 21.9%
e | 992 | 17.0%
r | 805 | 13.8%
l | 573 | 9.8%
w | 499 | 8.6%
t | 456 | 7.8%
y | 423 | 7.3%
o | 204 | 3.5%
d | 128 | 2.2%
i | 117 | 2.0%
Other values (4) | 352 | 6.0%

Uppercase Letter
Value | Count | Frequency (%)
M | 128 | 25.7%
V | 119 | 23.8%
S | 117 | 23.4%
N | 76 | 15.2%
E | 59 | 11.8%

Space Separator
Value | Count | Frequency (%)
(space) | 651 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 6326 | 90.7%
Common | 651 | 9.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 1278 | 20.2%
e | 992 | 15.7%
r | 805 | 12.7%
l | 573 | 9.1%
w | 499 | 7.9%
t | 456 | 7.2%
y | 423 | 6.7%
o | 204 | 3.2%
M | 128 | 2.0%
d | 128 | 2.0%
Other values (9) | 840 | 13.3%

Common
Value | Count | Frequency (%)
(space) | 651 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6977 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 1278 | 18.3%
e | 992 | 14.2%
r | 805 | 11.5%
(space) | 651 | 9.3%
l | 573 | 8.2%
w | 499 | 7.2%
t | 456 | 6.5%
y | 423 | 6.1%
o | 204 | 2.9%
M | 128 | 1.8%
Other values (10) | 968 | 13.9%

aware_sm_advan
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

aware_sm_interact
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

aware_sm_use
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

ethic_appr
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 498
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value | Count
Ethics approval is needed for any research that involves human participants; their tissue and /or data to ensure that the dignity, rights, safety and well-being of all participants are the primary consideration of the research project. | 2
The scope of the project and actions there in do not cross certain boundaries that may purposefully negatively affect participants as well as legal regulations and standard practices. | 1
That they are going to use the information they receive appropriately. They are not going to manipulate and misuse what they gather. | 1
It means that, in the opinion of the institution, the study and its methods are morally acceptable. | 1
Ethical approval means getting clearance to obtain data from a research subject. | 1
Other values (493) | 493

Length

Max length: 1026
Median length: 207
Mean length: 134.7935872
Min length: 15

Characters and Unicode

Total characters: 67262
Distinct characters: 66
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 497
Unique (%): 99.6%
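
"Distinct" and "Unique" differ by one here because one response occurs twice: distinct counts different values, while unique counts values that occur exactly once. The sketch below illustrates the distinction on synthetic placeholder data that mirrors the count pattern (497 one-off responses plus one response given twice); the strings are hypothetical and not taken from the dataset.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for count in freq.values() if count == 1)
    return distinct, unique

# Hypothetical stand-in for the 499 free-text responses.
responses = [f"response {i}" for i in range(497)] + ["repeated response"] * 2

distinct, unique = distinct_and_unique(responses)
n = len(responses)
print(distinct, f"{100 * distinct / n:.1f}%", unique, f"{100 * unique / n:.1f}%")
```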

Sample

1st row: The scope of the project and actions there in do not cross certain boundaries that may purposefully negatively affect participants as well as legal regulations and standard practices.
2nd row: I think Ethical Approval means that the experiment is gathering data without harm or injury to people.
3rd row: Researchers focus on ethical standards towards those they gain data from. They need approval of their approach and receive methods.
4th row: I would think that using "ethical approval" means that the things others collect on social media sites would need to be honest and moral. Hopefully, there would be no under-handedness used in collecting information.
5th row: A set of rules of what to do and what to not do.

Common Values

Value | Count | Frequency (%)
Ethics approval is needed for any research that involves human participants; their tissue and /or data to ensure that the dignity, rights, safety and well-being of all participants are the primary consideration of the research project. | 2 | 0.4%
The scope of the project and actions there in do not cross certain boundaries that may purposefully negatively affect participants as well as legal regulations and standard practices. | 1 | 0.2%
That they are going to use the information they receive appropriately. They are not going to manipulate and misuse what they gather. | 1 | 0.2%
It means that, in the opinion of the institution, the study and its methods are morally acceptable. | 1 | 0.2%
Ethical approval means getting clearance to obtain data from a research subject. | 1 | 0.2%
It means receiving approval from an IRB or other institution that has oversight over study approval. They make sure the studies to not hamr their subjects. | 1 | 0.2%
Ethical approval from the institution means they will act in a way that responsible and takes in to account the persons they are researching. | 1 | 0.2%
Proof that the experiment is not done against people's wills and if people ask, all data will be deleted. | 1 | 0.2%
There is (or should be ) oversight from someone in charge, and who is ethical. | 1 | 0.2%
Morally correct. | 1 | 0.2%
Other values (488) | 488 | 97.8%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
the | 736 | 6.5%
to | 474 | 4.2%
that | 434 | 3.8%
and | 295 | 2.6%
is | 249 | 2.2%
ethical | 249 | 2.2%
of | 227 | 2.0%
it | 217 | 1.9%
they | 212 | 1.9%
approval | 201 | 1.8%
Other values (1400) | 8064 | 71.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 11017 | 16.4%
e | 6612 | 9.8%
t | 6115 | 9.1%
a | 4947 | 7.4%
i | 4019 | 6.0%
o | 3823 | 5.7%
n | 3645 | 5.4%
s | 3457 | 5.1%
r | 3449 | 5.1%
h | 3300 | 4.9%
Other values (56) | 16878 | 25.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 54478 | 81.0%
Space Separator | 11017 | 16.4%
Other Punctuation | 966 | 1.4%
Uppercase Letter | 705 | 1.0%
Dash Punctuation | 36 | 0.1%
Open Punctuation | 22 | < 0.1%
Close Punctuation | 22 | < 0.1%
Final Punctuation | 8 | < 0.1%
Control | 7 | < 0.1%
Decimal Number | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 6612 | 12.1%
t | 6115 | 11.2%
a | 4947 | 9.1%
i | 4019 | 7.4%
o | 3823 | 7.0%
n | 3645 | 6.7%
s | 3457 | 6.3%
r | 3449 | 6.3%
h | 3300 | 6.1%
l | 2057 | 3.8%
Other values (16) | 13054 | 24.0%

Uppercase Letter
Value | Count | Frequency (%)
I | 257 | 36.5%
T | 144 | 20.4%
E | 91 | 12.9%
A | 46 | 6.5%
B | 24 | 3.4%
R | 23 | 3.3%
M | 19 | 2.7%
W | 15 | 2.1%
P | 9 | 1.3%
S | 9 | 1.3%
Other values (15) | 68 | 9.6%

Other Punctuation
Value | Count | Frequency (%)
. | 535 | 55.4%
, | 234 | 24.2%
' | 141 | 14.6%
" | 28 | 2.9%
/ | 12 | 1.2%
? | 10 | 1.0%
; | 4 | 0.4%
: | 2 | 0.2%

Space Separator
Value | Count | Frequency (%)
(space) | 11017 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 36 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 22 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 22 | 100.0%

Final Punctuation
Value | Count | Frequency (%)
(character not shown) | 8 | 100.0%

Control
Value | Count | Frequency (%)
(control character) | 7 | 100.0%

Decimal Number
Value | Count | Frequency (%)
3 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 55183 | 82.0%
Common | 12079 | 18.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 6612 | 12.0%
t | 6115 | 11.1%
a | 4947 | 9.0%
i | 4019 | 7.3%
o | 3823 | 6.9%
n | 3645 | 6.6%
s | 3457 | 6.3%
r | 3449 | 6.3%
h | 3300 | 6.0%
l | 2057 | 3.7%
Other values (41) | 13759 | 24.9%

Common
Value | Count | Frequency (%)
(space) | 11017 | 91.2%
. | 535 | 4.4%
, | 234 | 1.9%
' | 141 | 1.2%
- | 36 | 0.3%
" | 28 | 0.2%
( | 22 | 0.2%
) | 22 | 0.2%
/ | 12 | 0.1%
? | 10 | 0.1%
Other values (5) | 22 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 67254 | > 99.9%
Punctuation | 8 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 11017 | 16.4%
e | 6612 | 9.8%
t | 6115 | 9.1%
a | 4947 | 7.4%
i | 4019 | 6.0%
o | 3823 | 5.7%
n | 3645 | 5.4%
s | 3457 | 5.1%
r | 3449 | 5.1%
h | 3300 | 4.9%
Other values (55) | 16870 | 25.1%

Punctuation
Value | Count | Frequency (%)
(character not shown) | 8 | 100.0%

study_1_ethic_acc
Categorical

HIGH CORRELATION
MISSING

Distinct: 4
Distinct (%): 1.2%
Missing: 157
Missing (%): 31.5%
Memory size: 839.0 B

Value | Count
Somewhat acceptable | 144
Somewhat unacceptable | 81
Neutral | 62
Completey unacceptable | 55
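
The percentages for this variable use two different denominators. The missing share is relative to all 499 observations, while the distinct share of 1.2% is consistent with dividing by the 342 non-missing values (4 / 499 would give 0.8%). The sketch below checks that reading; it is an inference from the printed numbers, not a statement of how pandas-profiling computes them.

```python
# Figures copied from the study_1_ethic_acc overview.
n_observations = 499
missing = 157
distinct = 4
category_counts = [144, 81, 62, 55]

non_missing = n_observations - missing
missing_pct = 100 * missing / n_observations

# Assumption: Distinct (%) is taken over non-missing values only.
distinct_pct = 100 * distinct / non_missing

print(non_missing, round(missing_pct, 1), round(distinct_pct, 1))
```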

Length

Max length22
Median length21
Mean length17.78070175
Min length7

Characters and Unicode

Total characters6081
Distinct characters19
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
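
As a rough illustration of how such character properties can be read programmatically, the sketch below uses Python's standard `unicodedata` module to tally the general category of every character in this variable's answers, weighted by the value counts listed in this report. The mapping from two-letter category codes to readable names and the helper name `category_totals` are assumptions made for this example, not part of the profiling tool.

```python
import unicodedata
from collections import Counter

# Value counts copied from this report for study_1_ethic_acc
# (spelling kept as it appears in the data).
value_counts = {
    "Somewhat acceptable": 144,
    "Somewhat unacceptable": 81,
    "Neutral": 62,
    "Completey unacceptable": 55,
}

# Readable names for the general-category codes that occur in these labels.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
}


def category_totals(counts):
    """Count characters per Unicode general category, weighted by frequency."""
    totals = Counter()
    for value, n in counts.items():
        for ch in value:
            code = unicodedata.category(ch)  # e.g. 'Ll', 'Lu', 'Zs'
            totals[CATEGORY_NAMES.get(code, code)] += n
    return totals


totals = category_totals(value_counts)
for name, count in totals.most_common():
    print(name, count)
```

For these four labels the tally gives 5459 lowercase letters, 342 uppercase letters, and 280 spaces, matching the "Most occurring categories" table for this variable.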

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Neutral
2nd row: Neutral
3rd row: Somewhat acceptable
4th row: Completey unacceptable
5th row: Neutral

Common Values

Value | Count | Frequency (%)
Somewhat acceptable | 144 | 28.9%
Somewhat unacceptable | 81 | 16.2%
Neutral | 62 | 12.4%
Completey unacceptable | 55 | 11.0%
(Missing) | 157 | 31.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
somewhat | 225 | 36.2%
acceptable | 144 | 23.2%
unacceptable | 136 | 21.9%
neutral | 62 | 10.0%
completey | 55 | 8.8%
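
The word-level frequencies above can be reproduced from the value counts with a simple tokenisation. The sketch below assumes lowercasing and whitespace splitting, which is consistent with the figures shown here but is not confirmed as the profiling tool's exact procedure.

```python
from collections import Counter

# Value counts copied from this report for study_1_ethic_acc
# (spelling kept as it appears in the data).
value_counts = {
    "Somewhat acceptable": 144,
    "Somewhat unacceptable": 81,
    "Neutral": 62,
    "Completey unacceptable": 55,
}

# Assumption: words are obtained by lowercasing and splitting on whitespace.
word_totals = Counter()
for value, n in value_counts.items():
    for word in value.lower().split():
        word_totals[word] += n

total_words = sum(word_totals.values())
word_pct = {w: round(100 * c / total_words, 1) for w, c in word_totals.items()}

for word, count in word_totals.most_common():
    print(word, count, f"{word_pct[word]}%")
```

Here "somewhat" is counted once for each "Somewhat acceptable" and each "Somewhat unacceptable" answer (144 + 81 = 225), and percentages are shares of the 622 words in total.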

Most occurring characters

ValueCountFrequency (%)
e957
15.7%
a847
13.9%
t622
10.2%
c560
9.2%
l397
 
6.5%
p335
 
5.5%
m280
 
4.6%
b280
 
4.6%
280
 
4.6%
o280
 
4.6%
Other values (9)1243
20.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter5459
89.8%
Uppercase Letter342
 
5.6%
Space Separator280
 
4.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e957
17.5%
a847
15.5%
t622
11.4%
c560
10.3%
l397
7.3%
p335
 
6.1%
m280
 
5.1%
b280
 
5.1%
o280
 
5.1%
h225
 
4.1%
Other values (5)676
12.4%
Uppercase Letter
ValueCountFrequency (%)
S225
65.8%
N62
 
18.1%
C55
 
16.1%
Space Separator
ValueCountFrequency (%)
280
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin5801
95.4%
Common280
 
4.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e957
16.5%
a847
14.6%
t622
10.7%
c560
9.7%
l397
 
6.8%
p335
 
5.8%
m280
 
4.8%
b280
 
4.8%
o280
 
4.8%
S225
 
3.9%
Other values (8)1018
17.5%
Common
ValueCountFrequency (%)
280
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII6081
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e957
15.7%
a847
13.9%
t622
10.2%
c560
9.2%
l397
 
6.5%
p335
 
5.5%
m280
 
4.6%
b280
 
4.6%
280
 
4.6%
o280
 
4.6%
Other values (9)1243
20.4%

study_1_conc
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 253
Distinct (%): 98.4%
Missing: 242
Missing (%): 48.5%
Memory size: 4.0 KiB
None
 
4
participants were not aware they were part of a research study
 
2
the anonymity of twitter already draws an inordinate amount of these kinds of posters
 
1
again, not letting the twitter user that their data is being used does not seem honest and above boardi even have a hard time with the ethics of the researchers on prolific when they state, "we're sorry, but we had to deceive you", it reeks of manipulation that i will never understand is needed i am sure it is "for the greater good", but it still feels dishonest and not trustworthy
 
1
once again, users were not informed that they were part of a studysecondly, based on my experience from reading hate speech, i seriously doubt that a majority of people deleted their "hate" post after reading an empathetic response from a researcher the posts that i most often see deleted are those where someone has posted a reply that shows that the original poster misinterpreted some facts and were provided with correct info the rest of them often enjoy their roles as trolls and do not bother deleting much of anything
 
1
Other values (248)
248 

Length

Max length: 809
Median length: 182
Mean length: 132.3774319
Min length: 4

Characters and Unicode

Total characters: 34021
Distinct characters: 45
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 251
Unique (%): 97.7%

Sample

1st rowno concerns i would have loved to partake in this study in terms of watching the results
2nd rowi feel if people know they are being judged they will act, speak, or write differently than if they don't know they are being analyzed
3rd roweasy enough for an outside government to try copying such a study with the sole purpose of creating much more polarization, hate, etc not that it hasn't been tried and tested perhaps innumerable times by all types of foreign or domestic entities as far as we know no actual study would have really been needed to know that using a type of marketing manipulation could alter the recipients moodlevels of concernanxietyhateetc
4th rowthe participants were not were of this research study being conducted therefore it is unethical
5th rowdeceitful and lazy even with the positive result how you get results matter

Common Values

ValueCountFrequency (%)
None4
 
0.8%
participants were not aware they were part of a research study2
 
0.4%
the anonymity of twitter already draws an inordinate amount of these kinds of posters1
 
0.2%
again, not letting the twitter user that their data is being used does not seem honest and above boardi even have a hard time with the ethics of the researchers on prolific when they state, "we're sorry, but we had to deceive you", it reeks of manipulation that i will never understand is needed i am sure it is "for the greater good", but it still feels dishonest and not trustworthy1
 
0.2%
once again, users were not informed that they were part of a studysecondly, based on my experience from reading hate speech, i seriously doubt that a majority of people deleted their "hate" post after reading an empathetic response from a researcher the posts that i most often see deleted are those where someone has posted a reply that shows that the original poster misinterpreted some facts and were provided with correct info the rest of them often enjoy their roles as trolls and do not bother deleting much of anything1
 
0.2%
although i do not feel like this is a huge ethical issue, i do feel that any type of misleading intentionally might sit on an ethical borderline1
 
0.2%
using people's data without consent for a study seems unethical1
 
0.2%
same as before, this is all public and anyone can do these things, so i have no issue with it1
 
0.2%
in order for the study's results to be accurate users couldn't know that the researchers were running a study1
 
0.2%
i feel its quit acceptable posting something that reduces the hate in general also since it helps one to rethink their post1
 
0.2%
Other values (243)243
48.7%
(Missing)242
48.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the296
 
4.9%
to158
 
2.6%
that156
 
2.6%
i148
 
2.5%
a141
 
2.3%
of134
 
2.2%
is125
 
2.1%
they120
 
2.0%
study107
 
1.8%
not106
 
1.8%
Other values (1068)4546
75.3%

Most occurring characters

ValueCountFrequency (%)
5808
17.1%
e3735
11.0%
t3128
 
9.2%
a2221
 
6.5%
i2113
 
6.2%
o1871
 
5.5%
s1834
 
5.4%
n1794
 
5.3%
h1584
 
4.7%
r1488
 
4.4%
Other values (35)8445
24.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter27813
81.8%
Space Separator5808
 
17.1%
Other Punctuation352
 
1.0%
Dash Punctuation13
 
< 0.1%
Decimal Number9
 
< 0.1%
Close Punctuation8
 
< 0.1%
Final Punctuation7
 
< 0.1%
Open Punctuation7
 
< 0.1%
Uppercase Letter4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e3735
13.4%
t3128
11.2%
a2221
 
8.0%
i2113
 
7.6%
o1871
 
6.7%
s1834
 
6.6%
n1794
 
6.5%
h1584
 
5.7%
r1488
 
5.4%
d982
 
3.5%
Other values (16)7063
25.4%
Other Punctuation
ValueCountFrequency (%)
,161
45.7%
'128
36.4%
"38
 
10.8%
?16
 
4.5%
!5
 
1.4%
:3
 
0.9%
;1
 
0.3%
Decimal Number
ValueCountFrequency (%)
05
55.6%
22
 
22.2%
11
 
11.1%
41
 
11.1%
Close Punctuation
ValueCountFrequency (%)
)6
75.0%
]2
 
25.0%
Open Punctuation
ValueCountFrequency (%)
(5
71.4%
[2
 
28.6%
Space Separator
ValueCountFrequency (%)
5808
100.0%
Dash Punctuation
ValueCountFrequency (%)
-13
100.0%
Final Punctuation
ValueCountFrequency (%)
7
100.0%
Uppercase Letter
ValueCountFrequency (%)
N4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin27817
81.8%
Common6204
 
18.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e3735
13.4%
t3128
11.2%
a2221
 
8.0%
i2113
 
7.6%
o1871
 
6.7%
s1834
 
6.6%
n1794
 
6.4%
h1584
 
5.7%
r1488
 
5.3%
d982
 
3.5%
Other values (17)7067
25.4%
Common
ValueCountFrequency (%)
5808
93.6%
,161
 
2.6%
'128
 
2.1%
"38
 
0.6%
?16
 
0.3%
-13
 
0.2%
7
 
0.1%
)6
 
0.1%
05
 
0.1%
(5
 
0.1%
Other values (8)17
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII34014
> 99.9%
Punctuation7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5808
17.1%
e3735
11.0%
t3128
 
9.2%
a2221
 
6.5%
i2113
 
6.2%
o1871
 
5.5%
s1834
 
5.4%
n1794
 
5.3%
h1584
 
4.7%
r1488
 
4.4%
Other values (34)8438
24.8%
Punctuation
ValueCountFrequency (%)
7
100.0%

study_1_add_info
Categorical

HIGH CARDINALITY
HIGH CORRELATION
MISSING
UNIFORM

Distinct: 90
Distinct (%): 97.8%
Missing: 407
Missing (%): 81.6%
Memory size: 4.0 KiB
None
 
3
the fact that the fake accounts were used to try and suppress hate speech, makes it more ethical in my opinion
 
1
i would love to see the message that was posted by the researches to what was posted in order to see what was said and how it was worded
 
1
no one is getting harmed or misinformed here the study is actually trying to help people, so i think it's acceptable even though they did not know they were in a study
 
1
i would like to know more about what the replies actually said
 
1
Other values (85)
85 

Length

Max length: 312
Median length: 112.5
Mean length: 97.91304348
Min length: 4

Characters and Unicode

Total characters: 9008
Distinct characters: 40
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 89
Unique (%): 96.7%

Sample

1st rowi would be interested to know what kind of messages they sent the hate speech users that got them to change their minds
2nd rowfull disclosure of intent of researchers
3rd rowsee comments from previous studies
4th rowit kind of depends on what the replies were i guess i'm not 100% sure if it is ethically acceptable because i don't know what the researchers said i would assume it was ok though
5th rowinteresting idea for a study i'd have to think long and hard about its value i no longer use twitter

Common Values

ValueCountFrequency (%)
None3
 
0.6%
the fact that the fake accounts were used to try and suppress hate speech, makes it more ethical in my opinion1
 
0.2%
i would love to see the message that was posted by the researches to what was posted in order to see what was said and how it was worded1
 
0.2%
no one is getting harmed or misinformed here the study is actually trying to help people, so i think it's acceptable even though they did not know they were in a study1
 
0.2%
i would like to know more about what the replies actually said1
 
0.2%
i would want to know the various content of the messages created by the researchers1
 
0.2%
people in the study should have been aware or notified people before going forward with the study1
 
0.2%
no, no more information is needed the researchers lied, period1
 
0.2%
the results are illuminating in so far as it goes, but i would like to know how long lasting the effect of the empathetic messages were just the 4 days, longer, permanent change?1
 
0.2%
if users gave their consent before their data was published, this would be totally acceptable1
 
0.2%
Other values (80)80
 
16.0%
(Missing)407
81.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the93
 
5.8%
to73
 
4.5%
i44
 
2.7%
it33
 
2.1%
they33
 
2.1%
of32
 
2.0%
would28
 
1.7%
were26
 
1.6%
if23
 
1.4%
was23
 
1.4%
Other values (483)1200
74.6%

Most occurring characters

ValueCountFrequency (%)
1532
17.0%
e1038
11.5%
t806
 
8.9%
o555
 
6.2%
a545
 
6.1%
i511
 
5.7%
s502
 
5.6%
n436
 
4.8%
h414
 
4.6%
r376
 
4.2%
Other values (30)2293
25.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7362
81.7%
Space Separator1532
 
17.0%
Other Punctuation80
 
0.9%
Decimal Number12
 
0.1%
Dash Punctuation6
 
0.1%
Open Punctuation6
 
0.1%
Close Punctuation6
 
0.1%
Uppercase Letter3
 
< 0.1%
Final Punctuation1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e1038
14.1%
t806
10.9%
o555
 
7.5%
a545
 
7.4%
i511
 
6.9%
s502
 
6.8%
n436
 
5.9%
h414
 
5.6%
r376
 
5.1%
l272
 
3.7%
Other values (16)1907
25.9%
Other Punctuation
ValueCountFrequency (%)
,44
55.0%
'22
27.5%
?11
 
13.8%
"2
 
2.5%
%1
 
1.2%
Decimal Number
ValueCountFrequency (%)
08
66.7%
13
 
25.0%
41
 
8.3%
Space Separator
ValueCountFrequency (%)
1532
100.0%
Dash Punctuation
ValueCountFrequency (%)
-6
100.0%
Open Punctuation
ValueCountFrequency (%)
(6
100.0%
Close Punctuation
ValueCountFrequency (%)
)6
100.0%
Uppercase Letter
ValueCountFrequency (%)
N3
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin7365
81.8%
Common1643
 
18.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e1038
14.1%
t806
10.9%
o555
 
7.5%
a545
 
7.4%
i511
 
6.9%
s502
 
6.8%
n436
 
5.9%
h414
 
5.6%
r376
 
5.1%
l272
 
3.7%
Other values (17)1910
25.9%
Common
ValueCountFrequency (%)
1532
93.2%
,44
 
2.7%
'22
 
1.3%
?11
 
0.7%
08
 
0.5%
-6
 
0.4%
(6
 
0.4%
)6
 
0.4%
13
 
0.2%
"2
 
0.1%
Other values (3)3
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII9007
> 99.9%
Punctuation1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1532
17.0%
e1038
11.5%
t806
 
8.9%
o555
 
6.2%
a545
 
6.1%
i511
 
5.7%
s502
 
5.6%
n436
 
4.8%
h414
 
4.6%
r376
 
4.2%
Other values (29)2292
25.4%
Punctuation
ValueCountFrequency (%)
1
100.0%

study_2_ethic_acc
Categorical

HIGH CORRELATION
MISSING

Distinct: 3
Distinct (%): 0.9%
Missing: 173
Missing (%): 34.7%
Memory size: 839.0 B
Somewhat acceptable: 133
Somewhat unacceptable: 111
Neutral: 82

Length

Max length: 21
Median length: 19
Mean length: 16.66257669
Min length: 7

Characters and Unicode

Total characters: 5432
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Neutral
2nd row: Somewhat acceptable
3rd row: Somewhat unacceptable
4th row: Somewhat acceptable
5th row: Neutral

Common Values

Value | Count | Frequency (%)
Somewhat acceptable | 133 | 26.7%
Somewhat unacceptable | 111 | 22.2%
Neutral | 82 | 16.4%
(Missing) | 173 | 34.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
somewhat | 244 | 42.8%
acceptable | 133 | 23.3%
unacceptable | 111 | 19.5%
neutral | 82 | 14.4%

Most occurring characters

ValueCountFrequency (%)
e814
15.0%
a814
15.0%
t570
10.5%
c488
9.0%
l326
 
6.0%
S244
 
4.5%
b244
 
4.5%
p244
 
4.5%
244
 
4.5%
o244
 
4.5%
Other values (7)1200
22.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter4862
89.5%
Uppercase Letter326
 
6.0%
Space Separator244
 
4.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e814
16.7%
a814
16.7%
t570
11.7%
c488
10.0%
l326
6.7%
b244
 
5.0%
p244
 
5.0%
o244
 
5.0%
h244
 
5.0%
w244
 
5.0%
Other values (4)630
13.0%
Uppercase Letter
ValueCountFrequency (%)
S244
74.8%
N82
 
25.2%
Space Separator
ValueCountFrequency (%)
244
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin5188
95.5%
Common244
 
4.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e814
15.7%
a814
15.7%
t570
11.0%
c488
9.4%
l326
 
6.3%
S244
 
4.7%
b244
 
4.7%
p244
 
4.7%
o244
 
4.7%
h244
 
4.7%
Other values (6)956
18.4%
Common
ValueCountFrequency (%)
244
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII5432
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e814
15.0%
a814
15.0%
t570
10.5%
c488
9.0%
l326
 
6.0%
S244
 
4.5%
b244
 
4.5%
p244
 
4.5%
244
 
4.5%
o244
 
4.5%
Other values (7)1200
22.1%

study_2_conc
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 274
Distinct (%): 100.0%
Missing: 225
Missing (%): 45.1%
Memory size: 4.0 KiB
they probably should have been told at some point that it was a research study
 
1
using bots on unaware citizens without their consent would seem to be unethical not to mention the famous case of the linkedin founder funding the troll farm "research" to manipulate the alabama special election unethical if not criminal
 
1
i understand that when posting on social media, you content is at the will of everyone but, i feel if researcher or analytics are going to be performed against your inputs, you should be notified
 
1
participants were uninformed the idea of private messaging someone is also bothersome to me
 
1
in order to get accurate results for the study, the users couldn't know that they were being surveyed
 
1
Other values (269)
269 

Length

Max length: 640
Median length: 181.5
Mean length: 124.7956204
Min length: 32

Characters and Unicode

Total characters: 34194
Distinct characters: 41
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 274
Unique (%): 100.0%

Sample

1st rowgoing to the poster privately provided opportunity for change without the possibly of increased toxicity from users i prefer this method over commenting the "correct information"
2nd rowi feel as though, in the above case, users had a choice to respond or not so i think it was honest
3rd rowit's perfectly within someone's right to send someone else a message on any platform, therefore i believe this study was acceptable
4th rowthis is unethical because those involved were not adequately informed of the researchers intent
5th rowit’s deception no matter how you look at it and it’s lazy as twitter users are not a representative sample of reality

Common Values

ValueCountFrequency (%)
they probably should have been told at some point that it was a research study1
 
0.2%
using bots on unaware citizens without their consent would seem to be unethical not to mention the famous case of the linkedin founder funding the troll farm "research" to manipulate the alabama special election unethical if not criminal1
 
0.2%
i understand that when posting on social media, you content is at the will of everyone but, i feel if researcher or analytics are going to be performed against your inputs, you should be notified1
 
0.2%
participants were uninformed the idea of private messaging someone is also bothersome to me1
 
0.2%
in order to get accurate results for the study, the users couldn't know that they were being surveyed1
 
0.2%
again, these researchers are just observing public activity everything they did was allowable by anyone, so i have no issues with this1
 
0.2%
i feel like because the twitter users were intentionally mislead and the fact that it was a private message makes it less ethical1
 
0.2%
the users were not aware that they were part of the study and very importantly, the study was limited to conservative users only i find that objectionable and wide open for manipulating data and drawing misleading conclusions1
 
0.2%
a private message doesn't require the user to respond, they can simply ignore if they are not interested in sharing their opinion there's no harm in asking somebody their opinion, but the information gathered needs to be anonymous1
 
0.2%
i really do not like tricks or deception on any level1
 
0.2%
Other values (264)264
52.9%
(Missing)225
45.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the285
 
4.8%
to170
 
2.8%
i142
 
2.4%
a142
 
2.4%
of135
 
2.3%
they132
 
2.2%
that125
 
2.1%
not122
 
2.0%
is102
 
1.7%
study102
 
1.7%
Other values (1022)4508
75.6%

Most occurring characters

ValueCountFrequency (%)
5722
16.7%
e3651
10.7%
t3083
 
9.0%
a2255
 
6.6%
i2193
 
6.4%
s1951
 
5.7%
n1944
 
5.7%
o1926
 
5.6%
r1576
 
4.6%
h1476
 
4.3%
Other values (31)8417
24.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter28083
82.1%
Space Separator5722
 
16.7%
Other Punctuation333
 
1.0%
Dash Punctuation32
 
0.1%
Close Punctuation9
 
< 0.1%
Open Punctuation8
 
< 0.1%
Final Punctuation4
 
< 0.1%
Decimal Number3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e3651
13.0%
t3083
11.0%
a2255
 
8.0%
i2193
 
7.8%
s1951
 
6.9%
n1944
 
6.9%
o1926
 
6.9%
r1576
 
5.6%
h1476
 
5.3%
d996
 
3.5%
Other values (16)7032
25.0%
Other Punctuation
ValueCountFrequency (%)
,142
42.6%
'131
39.3%
"34
 
10.2%
?21
 
6.3%
;3
 
0.9%
!1
 
0.3%
:1
 
0.3%
Decimal Number
ValueCountFrequency (%)
21
33.3%
41
33.3%
11
33.3%
Space Separator
ValueCountFrequency (%)
5722
100.0%
Dash Punctuation
ValueCountFrequency (%)
-32
100.0%
Close Punctuation
ValueCountFrequency (%)
)9
100.0%
Open Punctuation
ValueCountFrequency (%)
(8
100.0%
Final Punctuation
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin28083
82.1%
Common6111
 
17.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e3651
13.0%
t3083
11.0%
a2255
 
8.0%
i2193
 
7.8%
s1951
 
6.9%
n1944
 
6.9%
o1926
 
6.9%
r1576
 
5.6%
h1476
 
5.3%
d996
 
3.5%
Other values (16)7032
25.0%
Common
ValueCountFrequency (%)
5722
93.6%
,142
 
2.3%
'131
 
2.1%
"34
 
0.6%
-32
 
0.5%
?21
 
0.3%
)9
 
0.1%
(8
 
0.1%
4
 
0.1%
;3
 
< 0.1%
Other values (5)5
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII34190
> 99.9%
Punctuation4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5722
16.7%
e3651
10.7%
t3083
 
9.0%
a2255
 
6.6%
i2193
 
6.4%
s1951
 
5.7%
n1944
 
5.7%
o1926
 
5.6%
r1576
 
4.6%
h1476
 
4.3%
Other values (30)8413
24.6%
Punctuation
ValueCountFrequency (%)
4
100.0%

study_2_add_info
Categorical

HIGH CARDINALITY
HIGH CORRELATION
MISSING
UNIFORM

Distinct: 91
Distinct (%): 96.8%
Missing: 405
Missing (%): 81.2%
Memory size: 4.0 KiB
None
 
4
telling the users about the study after it was completed would have made it a bit more ethical
 
1
i feel that the individuals who were messaged by the researches should be informed of why they are being contacted, and that information that they provide will be used in an academic study if this is done, then fair play, but it seems that was not done here
 
1
the part about not telling people that they are part of a research study because misinformation is spread too much and now the participants will likely believe things that are untrue
 
1
of course there is always the concern about funding and bias i wonder if because of the private messaging whether there was any supplemental back-and-forth between the unwilling participant and someone behind the study which has potential for some form of abuse or corruption of data
 
1
Other values (86)
86 

Length

Max length: 399
Median length: 139.5
Mean length: 99.82978723
Min length: 4

Characters and Unicode

Total characters: 9384
Distinct characters: 42
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 90
Unique (%): 95.7%

Sample

1st rowconcerns over the possibility of the researchers having their own political agenda yet fake news is a major problem what social media really is when mass sharing news (political news), is simple propaganda from the left and right
2nd rowfull disclosure if research intent
3rd rowhow do you pick a representative sample on a non representative platform
4th rowjust informing people about the study
5th rowif they used deceiving or untruthful tactics or anything of that nature

Common Values

ValueCountFrequency (%)
None4
 
0.8%
telling the users about the study after it was completed would have made it a bit more ethical1
 
0.2%
i feel that the individuals who were messaged by the researches should be informed of why they are being contacted, and that information that they provide will be used in an academic study if this is done, then fair play, but it seems that was not done here1
 
0.2%
the part about not telling people that they are part of a research study because misinformation is spread too much and now the participants will likely believe things that are untrue1
 
0.2%
of course there is always the concern about funding and bias i wonder if because of the private messaging whether there was any supplemental back-and-forth between the unwilling participant and someone behind the study which has potential for some form of abuse or corruption of data1
 
0.2%
i'd like to know the exact messages they are sending1
 
0.2%
you need people consent to do a study on them1
 
0.2%
if i found out who labeled the misinformation(same people who labeled hunter's laptop?) and if i found out that left leaning were also studied1
 
0.2%
they aren't deceiving the users, they just aren't keeping them informed1
 
0.2%
i understand the need to not inform participants and i suppose since all information gathered was in a public shared space there is no presumption of privacy1
 
0.2%
Other values (81)81
 
16.2%
(Missing)405
81.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the99
 
6.0%
i45
 
2.7%
to39
 
2.4%
of36
 
2.2%
they36
 
2.2%
a35
 
2.1%
if34
 
2.1%
would31
 
1.9%
that30
 
1.8%
study27
 
1.6%
Other values (502)1230
74.9%

Most occurring characters

ValueCountFrequency (%)
1556
16.6%
e993
10.6%
t813
 
8.7%
a598
 
6.4%
i590
 
6.3%
o536
 
5.7%
s524
 
5.6%
n483
 
5.1%
r433
 
4.6%
h403
 
4.3%
Other values (32)2455
26.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7722
82.3%
Space Separator1556
 
16.6%
Other Punctuation68
 
0.7%
Dash Punctuation10
 
0.1%
Open Punctuation8
 
0.1%
Close Punctuation8
 
0.1%
Decimal Number7
 
0.1%
Uppercase Letter4
 
< 0.1%
Final Punctuation1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e993
12.9%
t813
10.5%
a598
 
7.7%
i590
 
7.6%
o536
 
6.9%
s524
 
6.8%
n483
 
6.3%
r433
 
5.6%
h403
 
5.2%
l319
 
4.1%
Other values (16)2030
26.3%
Other Punctuation
ValueCountFrequency (%)
,32
47.1%
'18
26.5%
?9
 
13.2%
"8
 
11.8%
;1
 
1.5%
Decimal Number
ValueCountFrequency (%)
04
57.1%
12
28.6%
21
 
14.3%
Open Punctuation
ValueCountFrequency (%)
(7
87.5%
[1
 
12.5%
Close Punctuation
ValueCountFrequency (%)
)7
87.5%
]1
 
12.5%
Space Separator
ValueCountFrequency (%)
1556
100.0%
Dash Punctuation
ValueCountFrequency (%)
-10
100.0%
Uppercase Letter
ValueCountFrequency (%)
N4
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin7726
82.3%
Common1658
 
17.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e993
12.9%
t813
 
10.5%
a598
 
7.7%
i590
 
7.6%
o536
 
6.9%
s524
 
6.8%
n483
 
6.3%
r433
 
5.6%
h403
 
5.2%
l319
 
4.1%
Other values (17)2034
26.3%
Common
ValueCountFrequency (%)
1556
93.8%
,32
 
1.9%
'18
 
1.1%
-10
 
0.6%
?9
 
0.5%
"8
 
0.5%
(7
 
0.4%
)7
 
0.4%
04
 
0.2%
12
 
0.1%
Other values (5)5
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII9383
> 99.9%
Punctuation1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1556
16.6%
e993
10.6%
t813
 
8.7%
a598
 
6.4%
i590
 
6.3%
o536
 
5.7%
s524
 
5.6%
n483
 
5.1%
r433
 
4.6%
h403
 
4.3%
Other values (31)2454
26.2%
Punctuation
ValueCountFrequency (%)
1
100.0%

study_3_ethic_acc
Categorical

HIGH CORRELATION
MISSING

Distinct: 3
Distinct (%): 1.3%
Missing: 270
Missing (%): 54.1%
Memory size: 839.0 B
Somewhat acceptable: 134
Neutral: 56
Somewhat unacceptable: 39

Length

Max length: 21
Median length: 19
Mean length: 16.40611354
Min length: 7

Characters and Unicode

Total characters: 3757
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Neutral
2nd row: Somewhat acceptable
3rd row: Somewhat acceptable
4th row: Neutral
5th row: Neutral

Common Values

Value | Count | Frequency (%)
Somewhat acceptable | 134 | 26.9%
Neutral | 56 | 11.2%
Somewhat unacceptable | 39 | 7.8%
(Missing) | 270 | 54.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
somewhat | 173 | 43.0%
acceptable | 134 | 33.3%
neutral | 56 | 13.9%
unacceptable | 39 | 9.7%

Most occurring characters

ValueCountFrequency (%)
e575
15.3%
a575
15.3%
t402
10.7%
c346
9.2%
l229
 
6.1%
S173
 
4.6%
b173
 
4.6%
p173
 
4.6%
173
 
4.6%
o173
 
4.6%
Other values (7)765
20.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3355
89.3%
Uppercase Letter229
 
6.1%
Space Separator173
 
4.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e575
17.1%
a575
17.1%
t402
12.0%
c346
10.3%
l229
 
6.8%
b173
 
5.2%
p173
 
5.2%
o173
 
5.2%
h173
 
5.2%
w173
 
5.2%
Other values (4)363
10.8%
Uppercase Letter
ValueCountFrequency (%)
S173
75.5%
N56
 
24.5%
Space Separator
ValueCountFrequency (%)
173
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3584
95.4%
Common173
 
4.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e575
16.0%
a575
16.0%
t402
11.2%
c346
9.7%
l229
 
6.4%
S173
 
4.8%
b173
 
4.8%
p173
 
4.8%
o173
 
4.8%
h173
 
4.8%
Other values (6)592
16.5%
Common
ValueCountFrequency (%)
173
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3757
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e575
15.3%
a575
15.3%
t402
10.7%
c346
9.2%
l229
 
6.1%
S173
 
4.6%
b173
 
4.6%
p173
 
4.6%
173
 
4.6%
o173
 
4.6%
Other values (7)765
20.4%

study_3_conc
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 227
Distinct (%): 99.6%
Missing: 271
Missing (%): 54.3%
Memory size: 4.0 KiB
None
 
2
this is more acceptable because participants are informed ahead of time about the use of their data
 
1
the study was completely transparent
 
1
most importantly, the researchers got the approval of their users first at least the users were aware that they were taking part in a study even though, in my mind, they were severely underpaid
 
1
i feel that since all the participants were informed of the process and offered compensation and they willingly participated, there are not any unethical practices used in the process
 
1
Other values (222)
222 

Length

Max length: 719
Median length: 152.5
Mean length: 126.0175439
Min length: 4

Characters and Unicode

Total characters: 28732
Distinct characters: 48
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 226
Unique (%): 99.1%

Sample

1st rowi find this is ethical as long as participants were fully aware of what was being monitored the results are interesting! no concerns
2nd rowas long as the facebook users were informed that they would be in a study i feel it is fair it was up to the users whether they wanted to participate or not also, they were encouraged, but not actually made to like the facebook study
3rd rowthe web extension being used was invasive, even if it was used with consent the people participating in the study are not educated enough on exactly how much information the web extension was taking
4th rowthe researchers seem in some ways to try manipulating political viewpoints in a segment of the population for the sake of science
5th rowpeople willingly consented to being part of the research study, so i believe the study was completely acceptable

Common Values

ValueCountFrequency (%)
None2
 
0.4%
this is more acceptable because participants are informed ahead of time about the use of their data1
 
0.2%
the study was completely transparent1
 
0.2%
most importantly, the researchers got the approval of their users first at least the users were aware that they were taking part in a study even though, in my mind, they were severely underpaid1
 
0.2%
i feel that since all the participants were informed of the process and offered compensation and they willingly participated, there are not any unethical practices used in the process1
 
0.2%
my only concern is that the browser extension allowed researches to see all of the users' posts i think this would be okay if it was explicitly consented to by the participants, though the above doesn't specify1
 
0.2%
the researchers gave users an option whether they wanted to be a part of the study or not1
 
0.2%
this is clearly trying to skew how people view certain topics in social media if thousands of people are not "liking" posts organically it makes post engagement fake especially if this was something that was more related to fake news, and they receive 30k likes, that is problematic1
 
0.2%
participants self selected and were informed the option to continue to participate in the research was at their discretion i find no ethical concerns with this methodi am curious about how the researchers determined that the original political views were unchanged even in the face of more exposure to opposing news sources especially since the participants were voluntarily visiting the sites with more frequency and had less negative views about them1
 
0.2%
i see no real ethical issue with this statement that is shown above1
 
0.2%
Other values (217)217
43.5%
(Missing)271
54.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the334
 
6.6%
to164
 
3.3%
i128
 
2.5%
of120
 
2.4%
they104
 
2.1%
were104
 
2.1%
and98
 
1.9%
a97
 
1.9%
that96
 
1.9%
study92
 
1.8%
Other values (894)3709
73.5%
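
The word counts in the table above could be approximated as follows. This is a sketch only: it assumes lowercasing and whitespace tokenization, which may differ from what the profiler actually does, and `word_frequencies` is a hypothetical helper name.

```python
from collections import Counter


def word_frequencies(texts, top=10):
    """Return (word, count, percent of all words) for the most common words.

    Assumes lowercase + whitespace tokenization; the report does not state
    how the profiler splits text into words.
    """
    words = Counter()
    for text in texts:
        words.update(text.lower().split())
    total = sum(words.values())
    return [(w, c, 100 * c / total) for w, c in words.most_common(top)]


# Two values taken from this variable's listing
responses = [
    "the study was completely transparent",
    "the researchers gave users an option whether they wanted to be a part of the study or not",
]
for word, count, pct in word_frequencies(responses, top=3):
    print(f"{word}\t{count}\t{pct:.1f}%")
```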

Most occurring characters

ValueCountFrequency (%)
4835
16.8%
e3175
11.1%
t2625
 
9.1%
i1851
 
6.4%
a1826
 
6.4%
o1655
 
5.8%
s1582
 
5.5%
n1522
 
5.3%
r1337
 
4.7%
h1211
 
4.2%
Other values (38)7113
24.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter23632
82.2%
Space Separator4835
 
16.8%
Other Punctuation214
 
0.7%
Decimal Number20
 
0.1%
Dash Punctuation10
 
< 0.1%
Currency Symbol8
 
< 0.1%
Open Punctuation5
 
< 0.1%
Close Punctuation5
 
< 0.1%
Uppercase Letter2
 
< 0.1%
Final Punctuation1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e3175
13.4%
t2625
11.1%
i1851
 
7.8%
a1826
 
7.7%
o1655
 
7.0%
s1582
 
6.7%
n1522
 
6.4%
r1337
 
5.7%
h1211
 
5.1%
l835
 
3.5%
Other values (16)6013
25.4%
Other Punctuation
ValueCountFrequency (%)
,127
59.3%
'60
28.0%
"12
 
5.6%
?10
 
4.7%
!2
 
0.9%
%1
 
0.5%
;1
 
0.5%
:1
 
0.5%
Decimal Number
ValueCountFrequency (%)
06
30.0%
56
30.0%
84
20.0%
32
 
10.0%
11
 
5.0%
21
 
5.0%
Dash Punctuation
ValueCountFrequency (%)
-8
80.0%
2
 
20.0%
Space Separator
ValueCountFrequency (%)
4835
100.0%
Currency Symbol
ValueCountFrequency (%)
$8
100.0%
Open Punctuation
ValueCountFrequency (%)
(5
100.0%
Close Punctuation
ValueCountFrequency (%)
)5
100.0%
Uppercase Letter
ValueCountFrequency (%)
N2
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin23634
82.3%
Common5098
 
17.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e3175
13.4%
t2625
11.1%
i1851
 
7.8%
a1826
 
7.7%
o1655
 
7.0%
s1582
 
6.7%
n1522
 
6.4%
r1337
 
5.7%
h1211
 
5.1%
l835
 
3.5%
Other values (17)6015
25.5%
Common
ValueCountFrequency (%)
4835
94.8%
,127
 
2.5%
'60
 
1.2%
"12
 
0.2%
?10
 
0.2%
$8
 
0.2%
-8
 
0.2%
06
 
0.1%
56
 
0.1%
(5
 
0.1%
Other values (11)21
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII28729
> 99.9%
Punctuation3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4835
16.8%
e3175
11.1%
t2625
 
9.1%
i1851
 
6.4%
a1826
 
6.4%
o1655
 
5.8%
s1582
 
5.5%
n1522
 
5.3%
r1337
 
4.7%
h1211
 
4.2%
Other values (36)7110
24.7%
Punctuation
ValueCountFrequency (%)
2
66.7%
1
33.3%

study_3_add_info
Categorical

HIGH CARDINALITY
HIGH CORRELATION
MISSING
UNIFORM

Distinct: 76
Distinct (%): 100.0%
Missing: 423
Missing (%): 84.8%
Memory size: 4.0 KiB
if the researchers just tracked the people that signed up for the study without asking them first
 
1
i'm not sure who would want the results, but to either take money or get the results most would take the money other survey makers would take the results just because it's free information
 
1
i think the information is going to be skewed based upon what the user thinks the researcher is looking for they are also more likely to click on political sites because they want to make sure that the researcher is gathering enough data from them
 
1
what was collected by the extension would be valuable
 
1
so many ways for people to get manipulated with these studies i agree with this one being one of the good ones
 
1
Other values (71)
71 

Length

Max length: 298
Median length: 107.5
Mean length: 112.7763158
Min length: 4

Characters and Unicode

Total characters: 8571
Distinct characters: 41
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 76
Unique (%): 100.0%

Sample

1st row: making the source code for the web extension publicly available to have complete transparency over what the extension was doing
2nd row: since the study is revealed as a study, i think it’s ethical, but mostly nonsensical
3rd row: if the browser extension tracked anything besides what is stated it would be unacceptable
4th row: glad to hear the participants know about the research they will be a part of & be rewarded for their time
5th row: since they were aware they were taking part in a study it all seems above board i would generally prefer not to have to install anything

Common Values

ValueCountFrequency (%)
if the researchers just tracked the people that signed up for the study without asking them first1
 
0.2%
i'm not sure who would want the results, but to either take money or get the results most would take the money other survey makers would take the results just because it's free information1
 
0.2%
i think the information is going to be skewed based upon what the user thinks the researcher is looking for they are also more likely to click on political sites because they want to make sure that the researcher is gathering enough data from them1
 
0.2%
what was collected by the extension would be valuable1
 
0.2%
so many ways for people to get manipulated with these studies i agree with this one being one of the good ones1
 
0.2%
everyone was informed, so i think this is a good study1
 
0.2%
i think the level of compensation was woefully low which could potentially have skewed the results based on who would possibly participate for such a small amount of money or some lottery chance(i notice that the word acceptable is misspelled in the completely acceptable option)1
 
0.2%
i would just want to be sure that participants are well informed about the data collection process1
 
0.2%
i would find it unethical if the researchers mislead the participants or if they used the browser extension to gather information that the participants were not informed of1
 
0.2%
data should have been shown in a comparison form1
 
0.2%
Other values (66)66
 
13.2%
(Missing)423
84.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the123
 
8.0%
to48
 
3.1%
i38
 
2.5%
would37
 
2.4%
of29
 
1.9%
that28
 
1.8%
study24
 
1.6%
was23
 
1.5%
be23
 
1.5%
they22
 
1.4%
Other values (458)1134
74.2%

Most occurring characters

ValueCountFrequency (%)
1456
17.0%
e946
11.0%
t763
 
8.9%
a558
 
6.5%
o547
 
6.4%
i473
 
5.5%
s449
 
5.2%
n407
 
4.7%
r385
 
4.5%
h349
 
4.1%
Other values (31)2238
26.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7026
82.0%
Space Separator1456
 
17.0%
Other Punctuation60
 
0.7%
Decimal Number10
 
0.1%
Open Punctuation5
 
0.1%
Close Punctuation5
 
0.1%
Dash Punctuation4
 
< 0.1%
Currency Symbol3
 
< 0.1%
Uppercase Letter1
 
< 0.1%
Final Punctuation1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e946
13.5%
t763
10.9%
a558
 
7.9%
o547
 
7.8%
i473
 
6.7%
s449
 
6.4%
n407
 
5.8%
r385
 
5.5%
h349
 
5.0%
l286
 
4.1%
Other values (15)1863
26.5%
Other Punctuation
ValueCountFrequency (%)
,34
56.7%
'20
33.3%
?4
 
6.7%
&1
 
1.7%
!1
 
1.7%
Decimal Number
ValueCountFrequency (%)
55
50.0%
83
30.0%
11
 
10.0%
01
 
10.0%
Space Separator
ValueCountFrequency (%)
1456
100.0%
Open Punctuation
ValueCountFrequency (%)
(5
100.0%
Close Punctuation
ValueCountFrequency (%)
)5
100.0%
Dash Punctuation
ValueCountFrequency (%)
-4
100.0%
Currency Symbol
ValueCountFrequency (%)
$3
100.0%
Uppercase Letter
ValueCountFrequency (%)
N1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin7027
82.0%
Common1544
 
18.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e946
13.5%
t763
10.9%
a558
 
7.9%
o547
 
7.8%
i473
 
6.7%
s449
 
6.4%
n407
 
5.8%
r385
 
5.5%
h349
 
5.0%
l286
 
4.1%
Other values (16)1864
26.5%
Common
ValueCountFrequency (%)
1456
94.3%
,34
 
2.2%
'20
 
1.3%
55
 
0.3%
(5
 
0.3%
)5
 
0.3%
-4
 
0.3%
?4
 
0.3%
83
 
0.2%
$3
 
0.2%
Other values (5)5
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII8570
> 99.9%
Punctuation1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1456
17.0%
e946
11.0%
t763
 
8.9%
a558
 
6.5%
o547
 
6.4%
i473
 
5.5%
s449
 
5.2%
n407
 
4.7%
r385
 
4.5%
h349
 
4.1%
Other values (30)2237
26.1%
Punctuation
ValueCountFrequency (%)
1
100.0%
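
The block counts reported for study_3_add_info (ASCII 8570, Punctuation 1) separate plain ASCII from a single character in what the Unicode Standard calls the General Punctuation block (U+2000 to U+206F). The sketch below shows one way to label characters by block. It handles only these two blocks, `unicode_block` is a hypothetical helper, and the suggestion that the non-ASCII character is the curly apostrophe seen in the second sample row is an assumption rather than something the report states.

```python
import unicodedata


def unicode_block(ch):
    """Return a coarse block label for one character (only two blocks handled)."""
    cp = ord(ch)
    if cp <= 0x7F:
        return "ASCII"
    if 0x2000 <= cp <= 0x206F:
        return "General Punctuation"
    return "Other"


# Fragment modelled on the second sample row; \u2019 is the right single quotation mark
text = "i think it\u2019s ethical"
for ch in text:
    if unicode_block(ch) != "ASCII":
        # Print the character, its block label, and its general category code
        print(repr(ch), unicode_block(ch), unicodedata.category(ch))
```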

study_4_ethic_acc
Categorical

HIGH CORRELATION
MISSING

Distinct: 3
Distinct (%): 0.9%
Missing: 180
Missing (%): 36.1%
Memory size: 839.0 B
Somewhat acceptable
121 
Somewhat unacceptable
110 
Neutral
88 

Length

Max length: 21
Median length: 19
Mean length: 16.37931034
Min length: 7

Characters and Unicode

Total characters: 5225
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Neutral
2nd row: Neutral
3rd row: Somewhat unacceptable
4th row: Somewhat unacceptable
5th row: Somewhat acceptable

Common Values

Value | Count | Frequency (%)
Somewhat acceptable | 121 | 24.2%
Somewhat unacceptable | 110 | 22.0%
Neutral | 88 | 17.6%
(Missing) | 180 | 36.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
somewhat231
42.0%
acceptable121
22.0%
unacceptable110
20.0%
neutral88
 
16.0%

Most occurring characters

ValueCountFrequency (%)
e781
14.9%
a781
14.9%
t550
10.5%
c462
 
8.8%
l319
 
6.1%
S231
 
4.4%
b231
 
4.4%
p231
 
4.4%
231
 
4.4%
o231
 
4.4%
Other values (7)1177
22.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter4675
89.5%
Uppercase Letter319
 
6.1%
Space Separator231
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e781
16.7%
a781
16.7%
t550
11.8%
c462
9.9%
l319
6.8%
b231
 
4.9%
p231
 
4.9%
o231
 
4.9%
h231
 
4.9%
w231
 
4.9%
Other values (4)627
13.4%
Uppercase Letter
ValueCountFrequency (%)
S231
72.4%
N88
 
27.6%
Space Separator
ValueCountFrequency (%)
231
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin4994
95.6%
Common231
 
4.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e781
15.6%
a781
15.6%
t550
11.0%
c462
9.3%
l319
 
6.4%
S231
 
4.6%
b231
 
4.6%
p231
 
4.6%
o231
 
4.6%
h231
 
4.6%
Other values (6)946
18.9%
Common
ValueCountFrequency (%)
231
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII5225
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e781
14.9%
a781
14.9%
t550
10.5%
c462
 
8.8%
l319
 
6.1%
S231
 
4.4%
b231
 
4.4%
p231
 
4.4%
231
 
4.4%
o231
 
4.4%
Other values (7)1177
22.5%

study_4_conc
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 253
Distinct (%): 100.0%
Missing: 246
Missing (%): 49.3%
Memory size: 4.0 KiB
same, the researchers created fake accounts that looked like people this is fraudulent
 
1
i don't agree with them hiding behind bots to complete their research and not telling the end user this was for research purposes upfront
 
1
the idea that one is not informed that their replies are being used is not acceptable to me, because information in many twitter accounts can successfully identify the person i use twitter a lot, only for sports but i do see a lot of different topics, many political that are posted the amount of bots on twitter is astounding to me, and i would never consider using it for data processing just my opinion
 
1
i'm not a big fan of participants being unaware, but as long as they remained anonymous i'm neutral on this one
 
1
i object to most studies in which users are not informed that they are being studied and that they are being manipulatedsecondly, i would like to see the type of reply that was originally sent this study almost directly contradicts the results reached in the last study in which people deleted their hate speech because they got a link to a fact checking site along with an empathetic responselastly, i now do my own fact checking after learning that quite a few of these fact checkers are deliberately manipulating and distorting info because of their own bias i simply no longer trust the "fact checkers"
 
1
Other values (248)
248 

Length

Max length: 815
Median length: 202
Mean length: 140.0079051
Min length: 4

Characters and Unicode

Total characters: 35422
Distinct characters: 53
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 253
Unique (%): 100.0%

Sample

1st row: i am uncertain how i feel completely about a researcher creating a fake account however i do understand the desire to protect themselves and to not give away their actions as being part of a study this misinformation needed to be corrected for the public but it opened the original poster to toxicity the op may not have known it was incorrect
2nd row: users were not aware of what was going on so they were possibly more honest in their opinions because they had no idea they were being analyzed
3rd row: many of the people that have large political followings on twitter (and many who don't) often know already the news they are sharing is fake it's political partisanship and the spreading of propaganda some might post fake news only to gain more followers (the masses) if they believe it serves that end
4th row: this study is unethfull disclosure of intent of researchical because they were not informed of the research study
5th row: i have the same objections as before, it’s deceitful and lazy and twitter is a non representative sample of the public

Common Values

ValueCountFrequency (%)
same, the researchers created fake accounts that looked like people this is fraudulent1
 
0.2%
i don't agree with them hiding behind bots to complete their research and not telling the end user this was for research purposes upfront1
 
0.2%
the idea that one is not informed that their replies are being used is not acceptable to me, because information in many twitter accounts can successfully identify the person i use twitter a lot, only for sports but i do see a lot of different topics, many political that are posted the amount of bots on twitter is astounding to me, and i would never consider using it for data processing just my opinion1
 
0.2%
i'm not a big fan of participants being unaware, but as long as they remained anonymous i'm neutral on this one1
 
0.2%
i object to most studies in which users are not informed that they are being studied and that they are being manipulatedsecondly, i would like to see the type of reply that was originally sent this study almost directly contradicts the results reached in the last study in which people deleted their hate speech because they got a link to a fact checking site along with an empathetic responselastly, i now do my own fact checking after learning that quite a few of these fact checkers are deliberately manipulating and distorting info because of their own bias i simply no longer trust the "fact checkers"1
 
0.2%
since this was done in a public setting, i feel like it is more ethically acceptable than the other study that was similar, but done through private messages1
 
0.2%
i personally have no issue with this study, though objectively it seems a little dubious to study individuals without their knowledge like this1
 
0.2%
this is all public, so i have no issue with the researchers observing this1
 
0.2%
i feel like this is acceptable because when you sign up for social media, if you are posting something publicly it is assumed that anyone can look at these posts and reply to them1
 
0.2%
i have found that some "fact-checking" sites end up having wrong information as well a great case is how the idea that masks don't work spread many studies have been conducted and some very reputable scientific organizations have studies on their websites that say masks do not work, but the data is typically very small or the type of mask used was a very thin cloth mask, but it gives fuel to people who spread the false information that masking does not work1
 
0.2%
Other values (243)243
48.7%
(Missing)246
49.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the302
 
4.9%
to152
 
2.5%
i152
 
2.5%
of147
 
2.4%
that147
 
2.4%
they137
 
2.2%
a135
 
2.2%
is120
 
1.9%
and111
 
1.8%
not110
 
1.8%
Other values (1090)4690
75.6%

Most occurring characters

ValueCountFrequency (%)
5976
16.9%
e3602
10.2%
t3227
 
9.1%
a2341
 
6.6%
i2331
 
6.6%
o2035
 
5.7%
n2025
 
5.7%
s1865
 
5.3%
r1535
 
4.3%
h1493
 
4.2%
Other values (43)8992
25.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter28986
81.8%
Space Separator5976
 
16.9%
Other Punctuation356
 
1.0%
Dash Punctuation34
 
0.1%
Decimal Number34
 
0.1%
Open Punctuation10
 
< 0.1%
Close Punctuation9
 
< 0.1%
Final Punctuation6
 
< 0.1%
Math Symbol5
 
< 0.1%
Connector Punctuation4
 
< 0.1%
Other values (2)2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e3602
12.4%
t3227
11.1%
a2341
 
8.1%
i2331
 
8.0%
o2035
 
7.0%
n2025
 
7.0%
s1865
 
6.4%
r1535
 
5.3%
h1493
 
5.2%
d1012
 
3.5%
Other values (16)7520
25.9%
Other Punctuation
ValueCountFrequency (%)
,176
49.4%
'105
29.5%
"46
 
12.9%
?13
 
3.7%
%9
 
2.5%
&3
 
0.8%
:2
 
0.6%
!1
 
0.3%
;1
 
0.3%
Decimal Number
ValueCountFrequency (%)
213
38.2%
012
35.3%
72
 
5.9%
32
 
5.9%
52
 
5.9%
11
 
2.9%
41
 
2.9%
91
 
2.9%
Math Symbol
ValueCountFrequency (%)
=4
80.0%
~1
 
20.0%
Space Separator
ValueCountFrequency (%)
5976
100.0%
Dash Punctuation
ValueCountFrequency (%)
-34
100.0%
Open Punctuation
ValueCountFrequency (%)
(10
100.0%
Close Punctuation
ValueCountFrequency (%)
)9
100.0%
Final Punctuation
ValueCountFrequency (%)
6
100.0%
Connector Punctuation
ValueCountFrequency (%)
_4
100.0%
Uppercase Letter
ValueCountFrequency (%)
N1
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin28987
81.8%
Common6435
 
18.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e3602
12.4%
t3227
11.1%
a2341
 
8.1%
i2331
 
8.0%
o2035
 
7.0%
n2025
 
7.0%
s1865
 
6.4%
r1535
 
5.3%
h1493
 
5.2%
d1012
 
3.5%
Other values (17)7521
25.9%
Common
ValueCountFrequency (%)
5976
92.9%
,176
 
2.7%
'105
 
1.6%
"46
 
0.7%
-34
 
0.5%
?13
 
0.2%
213
 
0.2%
012
 
0.2%
(10
 
0.2%
)9
 
0.1%
Other values (16)41
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII35415
> 99.9%
Punctuation7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5976
16.9%
e3602
10.2%
t3227
 
9.1%
a2341
 
6.6%
i2331
 
6.6%
o2035
 
5.7%
n2025
 
5.7%
s1865
 
5.3%
r1535
 
4.3%
h1493
 
4.2%
Other values (41)8985
25.4%
Punctuation
ValueCountFrequency (%)
6
85.7%
1
 
14.3%

study_4_add_info
Categorical

HIGH CARDINALITY
HIGH CORRELATION
MISSING
UNIFORM

Distinct: 87
Distinct (%): 100.0%
Missing: 412
Missing (%): 82.6%
Memory size: 4.0 KiB
the researchers had a purpose in seeing the responses of those interacting with the post i do not agree with how it was done entirely however i do not know a better way to get the results that were desired
 
1
i would not think it is ethical no matter what
 
1
the researchers relied on "fact checkers" to determine if the information was "fake" or not if there existed a bias or margin of error in the fact checker's process then the researchers would be working with wrong information themselves
 
1
i find it acceptable because this method is helping to slow the spread of misinformation
 
1
the outcome of this study has also resulted in negative behavior by the participants due to the experiment
 
1
Other values (82)
82 

Length

Max length: 577
Median length: 118
Mean length: 113.908046
Min length: 32

Characters and Unicode

Total characters: 9910
Distinct characters: 37
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 87
Unique (%): 100.0%

Sample

1st row: the researchers had a purpose in seeing the responses of those interacting with the post i do not agree with how it was done entirely however i do not know a better way to get the results that were desired
2nd row: full disclosure of intent of researchers
3rd row: please see comments from the first study
4th row: if they had told people they would be part of an experiment
5th row: i don't use twitter too many bots

Common Values

ValueCountFrequency (%)
the researchers had a purpose in seeing the responses of those interacting with the post i do not agree with how it was done entirely however i do not know a better way to get the results that were desired1
 
0.2%
i would not think it is ethical no matter what1
 
0.2%
the researchers relied on "fact checkers" to determine if the information was "fake" or not if there existed a bias or margin of error in the fact checker's process then the researchers would be working with wrong information themselves1
 
0.2%
i find it acceptable because this method is helping to slow the spread of misinformation1
 
0.2%
the outcome of this study has also resulted in negative behavior by the participants due to the experiment1
 
0.2%
completely unethical customers were unaware of the study and fake accounts were made no one was compensated or aware1
 
0.2%
i would like to know example content of the tweets and be shown an example of the bot accounts - it's hard to know exactly how individuals would react to someone pointing out their wrong without knowing the profile of the person pointing out the error1
 
0.2%
researchers should not be spreading fake news in the name of research1
 
0.2%
fake news is a plague but there's got to be better ways to get rid of it deception1
 
0.2%
i don't think it's acceptable because non human bots were used and even though they linked to a fact checking website, they are still influencing people and that will cause more divisiveness1
 
0.2%
Other values (77)77
 
15.4%
(Missing)412
82.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the98
 
5.5%
to59
 
3.3%
i41
 
2.3%
of37
 
2.1%
it36
 
2.0%
a33
 
1.8%
if32
 
1.8%
they31
 
1.7%
and29
 
1.6%
that29
 
1.6%
Other values (528)1360
76.2%

Most occurring characters

ValueCountFrequency (%)
1703
17.2%
e1088
11.0%
t910
 
9.2%
a606
 
6.1%
o572
 
5.8%
i571
 
5.8%
s515
 
5.2%
n483
 
4.9%
h441
 
4.5%
r438
 
4.4%
Other values (27)2583
26.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter8109
81.8%
Space Separator1703
 
17.2%
Other Punctuation76
 
0.8%
Dash Punctuation10
 
0.1%
Open Punctuation4
 
< 0.1%
Close Punctuation4
 
< 0.1%
Decimal Number4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e1088
13.4%
t910
11.2%
a606
 
7.5%
o572
 
7.1%
i571
 
7.0%
s515
 
6.4%
n483
 
6.0%
h441
 
5.4%
r438
 
5.4%
l309
 
3.8%
Other values (16)2176
26.8%
Other Punctuation
ValueCountFrequency (%)
,29
38.2%
'25
32.9%
"14
18.4%
?6
 
7.9%
;2
 
2.6%
Decimal Number
ValueCountFrequency (%)
03
75.0%
21
 
25.0%
Space Separator
ValueCountFrequency (%)
1703
100.0%
Dash Punctuation
ValueCountFrequency (%)
-10
100.0%
Open Punctuation
ValueCountFrequency (%)
(4
100.0%
Close Punctuation
ValueCountFrequency (%)
)4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8109
81.8%
Common1801
 
18.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e1088
13.4%
t910
11.2%
a606
 
7.5%
o572
 
7.1%
i571
 
7.0%
s515
 
6.4%
n483
 
6.0%
h441
 
5.4%
r438
 
5.4%
l309
 
3.8%
Other values (16)2176
26.8%
Common
ValueCountFrequency (%)
1703
94.6%
,29
 
1.6%
'25
 
1.4%
"14
 
0.8%
-10
 
0.6%
?6
 
0.3%
(4
 
0.2%
)4
 
0.2%
03
 
0.2%
;2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII9910
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1703
17.2%
e1088
11.0%
t910
 
9.2%
a606
 
6.1%
o572
 
5.8%
i571
 
5.8%
s515
 
5.2%
n483
 
4.9%
h441
 
4.5%
r438
 
4.4%
Other values (27)2583
26.1%

design_cont
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 839.0 B
Very important
165 
Moderately important
124 
Extremely important
105 
Slightly important
67 
Not at all important
38 

Length

Max length: 20
Median length: 19
Mean length: 17.53707415
Min length: 14

Characters and Unicode

Total characters: 8751
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Not at all important
2nd row: Not at all important
3rd row: Extremely important
4th row: Moderately important
5th row: Extremely important

Common Values

Value | Count | Frequency (%)
Very important | 165 | 33.1%
Moderately important | 124 | 24.8%
Extremely important | 105 | 21.0%
Slightly important | 67 | 13.4%
Not at all important | 38 | 7.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
important499
46.5%
very165
 
15.4%
moderately124
 
11.5%
extremely105
 
9.8%
slightly67
 
6.2%
not38
 
3.5%
at38
 
3.5%
all38
 
3.5%

Most occurring characters

ValueCountFrequency (%)
t1370
15.7%
r893
10.2%
a699
 
8.0%
o661
 
7.6%
e623
 
7.1%
m604
 
6.9%
575
 
6.6%
i566
 
6.5%
p499
 
5.7%
n499
 
5.7%
Other values (11)1762
20.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7677
87.7%
Space Separator575
 
6.6%
Uppercase Letter499
 
5.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t1370
17.8%
r893
11.6%
a699
9.1%
o661
8.6%
e623
8.1%
m604
7.9%
i566
7.4%
p499
 
6.5%
n499
 
6.5%
y461
 
6.0%
Other values (5)802
10.4%
Uppercase Letter
ValueCountFrequency (%)
V165
33.1%
M124
24.8%
E105
21.0%
S67
13.4%
N38
 
7.6%
Space Separator
ValueCountFrequency (%)
575
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8176
93.4%
Common575
 
6.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
t1370
16.8%
r893
10.9%
a699
8.5%
o661
8.1%
e623
7.6%
m604
7.4%
i566
6.9%
p499
 
6.1%
n499
 
6.1%
y461
 
5.6%
Other values (10)1301
15.9%
Common
ValueCountFrequency (%)
575
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII8751
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t1370
15.7%
r893
10.2%
a699
 
8.0%
o661
 
7.6%
e623
 
7.1%
m604
 
6.9%
575
 
6.6%
i566
 
6.5%
p499
 
5.7%
n499
 
5.7%
Other values (11)1762
20.1%

design_num_users
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 839.0 B
Not at all important
125 
Very important
106 
Slightly important
95 
Moderately important
89 
Extremely important
84 

Length

Max length: 20
Median length: 19
Mean length: 18.17635271
Min length: 14

Characters and Unicode

Total characters: 9070
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Not at all important
2nd row: Not at all important
3rd row: Very important
4th row: Moderately important
5th row: Not at all important

Common Values

Value | Count | Frequency (%)
Not at all important | 125 | 25.1%
Very important | 106 | 21.2%
Slightly important | 95 | 19.0%
Moderately important | 89 | 17.8%
Extremely important | 84 | 16.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
important499
40.0%
not125
 
10.0%
at125
 
10.0%
all125
 
10.0%
very106
 
8.5%
slightly95
 
7.6%
moderately89
 
7.1%
extremely84
 
6.7%

Most occurring characters

ValueCountFrequency (%)
t1516
16.7%
a838
9.2%
r778
8.6%
749
8.3%
o713
7.9%
l613
 
6.8%
i594
 
6.5%
m583
 
6.4%
n499
 
5.5%
p499
 
5.5%
Other values (11)1688
18.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7822
86.2%
Space Separator749
 
8.3%
Uppercase Letter499
 
5.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t1516
19.4%
a838
10.7%
r778
9.9%
o713
9.1%
l613
7.8%
i594
 
7.6%
m583
 
7.5%
n499
 
6.4%
p499
 
6.4%
e452
 
5.8%
Other values (5)737
9.4%
Uppercase Letter
ValueCountFrequency (%)
N125
25.1%
V106
21.2%
S95
19.0%
M89
17.8%
E84
16.8%
Space Separator
ValueCountFrequency (%)
749
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8321
91.7%
Common749
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
t1516
18.2%
a838
10.1%
r778
9.3%
o713
8.6%
l613
7.4%
i594
 
7.1%
m583
 
7.0%
n499
 
6.0%
p499
 
6.0%
e452
 
5.4%
Other values (10)1236
14.9%
Common
ValueCountFrequency (%)
749
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII9070
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t1516
16.7%
a838
9.2%
r778
8.6%
749
8.3%
o713
7.9%
l613
 
6.8%
i594
 
6.5%
m583
 
6.4%
n499
 
5.5%
p499
 
5.5%
Other values (11)1688
18.6%

design_res_purp
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 839.0 B
Very important
136 
Extremely important
121 
Moderately important
109 
Slightly important
68 
Not at all important
65 

Length

Max length: 20
Median length: 19
Mean length: 17.8496994
Min length: 14

Characters and Unicode

Total characters: 8907
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Not at all important
2nd row: Not at all important
3rd row: Very important
4th row: Extremely important
5th row: Not at all important

Common Values

Value | Count | Frequency (%)
Very important | 136 | 27.3%
Extremely important | 121 | 24.2%
Moderately important | 109 | 21.8%
Slightly important | 68 | 13.6%
Not at all important | 65 | 13.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
important499
44.2%
very136
 
12.1%
extremely121
 
10.7%
moderately109
 
9.7%
slightly68
 
6.0%
not65
 
5.8%
at65
 
5.8%
all65
 
5.8%
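
The word table above appears to count lowercased, whitespace-separated tokens across all rows. A minimal sketch under that assumption, reusing the design_res_purp value counts from the report (the names are illustrative):

```python
from collections import Counter

# Value counts for design_res_purp, copied from the report above.
value_counts = {
    "Very important": 136,
    "Extremely important": 121,
    "Moderately important": 109,
    "Slightly important": 68,
    "Not at all important": 65,
}

# Count each lowercased token once per row that contains it.
word_counts = Counter()
for label, n_rows in value_counts.items():
    for word in label.lower().split():
        word_counts[word] += n_rows

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word}: {count} ({count / total_words:.1%})")
```

Under this reading, "important" occurs 499 times out of 1128 tokens, which rounds to the 44.2% shown in the table.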

Most occurring characters

ValueCountFrequency (%)
t1426
16.0%
r865
9.7%
a738
8.3%
o673
 
7.6%
629
 
7.1%
m620
 
7.0%
e596
 
6.7%
i567
 
6.4%
p499
 
5.6%
n499
 
5.6%
Other values (11)1795
20.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7779
87.3%
Space Separator629
 
7.1%
Uppercase Letter499
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t1426
18.3%
r865
11.1%
a738
9.5%
o673
8.7%
m620
8.0%
e596
7.7%
i567
 
7.3%
p499
 
6.4%
n499
 
6.4%
l496
 
6.4%
Other values (5)800
10.3%
Uppercase Letter
ValueCountFrequency (%)
V136
27.3%
E121
24.2%
M109
21.8%
S68
13.6%
N65
13.0%
Space Separator
ValueCountFrequency (%)
629
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8278
92.9%
Common629
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
t1426
17.2%
r865
10.4%
a738
8.9%
o673
8.1%
m620
7.5%
e596
7.2%
i567
 
6.8%
p499
 
6.0%
n499
 
6.0%
l496
 
6.0%
Other values (10)1299
15.7%
Common
ValueCountFrequency (%)
629
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII8907
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t1426
16.0%
r865
9.7%
a738
8.3%
o673
 
7.6%
629
 
7.1%
m620
 
7.0%
e596
 
6.7%
i567
 
6.4%
p499
 
5.6%
n499
 
5.6%
Other values (11)1795
20.2%

design_len_data
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size839.0 B
Very important
120 
Moderately important
119 
Slightly important
107 
Not at all important
97 
Extremely important
56 

Length

Max length20
Median length19
Mean length18.01603206
Min length14

Characters and Unicode

Total characters8990
Distinct characters21
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNot at all important
2nd rowNot at all important
3rd rowExtremely important
4th rowVery important
5th rowNot at all important

Common Values

ValueCountFrequency (%)
Very important120
24.0%
Moderately important119
23.8%
Slightly important107
21.4%
Not at all important97
19.4%
Extremely important56
11.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
important499
41.9%
very120
 
10.1%
moderately119
 
10.0%
slightly107
 
9.0%
not97
 
8.1%
at97
 
8.1%
all97
 
8.1%
extremely56
 
4.7%

Most occurring characters

ValueCountFrequency (%)
t1474
16.4%
a812
9.0%
r794
8.8%
o715
8.0%
693
 
7.7%
i606
 
6.7%
l583
 
6.5%
m555
 
6.2%
p499
 
5.6%
n499
 
5.6%
Other values (11)1760
19.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7798
86.7%
Space Separator693
 
7.7%
Uppercase Letter499
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t1474
18.9%
a812
10.4%
r794
10.2%
o715
9.2%
i606
7.8%
l583
 
7.5%
m555
 
7.1%
p499
 
6.4%
n499
 
6.4%
e470
 
6.0%
Other values (5)791
10.1%
Uppercase Letter
ValueCountFrequency (%)
V120
24.0%
M119
23.8%
S107
21.4%
N97
19.4%
E56
11.2%
Space Separator
ValueCountFrequency (%)
693
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8297
92.3%
Common693
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t1474
17.8%
a812
9.8%
r794
9.6%
o715
8.6%
i606
7.3%
l583
 
7.0%
m555
 
6.7%
p499
 
6.0%
n499
 
6.0%
e470
 
5.7%
Other values (10)1290
15.5%
Common
ValueCountFrequency (%)
693
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII8990
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t1474
16.4%
a812
9.0%
r794
8.8%
o715
8.0%
693
 
7.7%
i606
 
6.7%
l583
 
6.5%
m555
 
6.2%
p499
 
5.6%
n499
 
5.6%
Other values (11)1760
19.6%

design_admin_inter
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size839.0 B
Moderately important
139 
Very important
109 
Slightly important
96 
Not at all important
91 
Extremely important
64 

Length

Max length20
Median length19
Mean length18.17635271
Min length14

Characters and Unicode

Total characters9070
Distinct characters21
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNot at all important
2nd rowNot at all important
3rd rowModerately important
4th rowModerately important
5th rowNot at all important

Common Values

ValueCountFrequency (%)
Moderately important139
27.9%
Very important109
21.8%
Slightly important96
19.2%
Not at all important91
18.2%
Extremely important64
12.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
important499
42.3%
moderately139
 
11.8%
very109
 
9.2%
slightly96
 
8.1%
not91
 
7.7%
at91
 
7.7%
all91
 
7.7%
extremely64
 
5.4%

Most occurring characters

ValueCountFrequency (%)
t1479
16.3%
a820
9.0%
r811
8.9%
o729
8.0%
681
 
7.5%
i595
 
6.6%
l577
 
6.4%
m563
 
6.2%
e515
 
5.7%
p499
 
5.5%
Other values (11)1801
19.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7890
87.0%
Space Separator681
 
7.5%
Uppercase Letter499
 
5.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t1479
18.7%
a820
10.4%
r811
10.3%
o729
9.2%
i595
7.5%
l577
 
7.3%
m563
 
7.1%
e515
 
6.5%
p499
 
6.3%
n499
 
6.3%
Other values (5)803
10.2%
Uppercase Letter
ValueCountFrequency (%)
M139
27.9%
V109
21.8%
S96
19.2%
N91
18.2%
E64
12.8%
Space Separator
ValueCountFrequency (%)
681
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8389
92.5%
Common681
 
7.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
t1479
17.6%
a820
9.8%
r811
9.7%
o729
8.7%
i595
7.1%
l577
 
6.9%
m563
 
6.7%
e515
 
6.1%
p499
 
5.9%
n499
 
5.9%
Other values (10)1302
15.5%
Common
ValueCountFrequency (%)
681
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII9070
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t1479
16.3%
a820
9.0%
r811
8.9%
o729
8.0%
681
 
7.5%
i595
 
6.6%
l577
 
6.4%
m563
 
6.2%
e515
 
5.7%
p499
 
5.5%
Other values (11)1801
19.9%

design_inter_type
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size839.0 B
Very important
162 
Moderately important
145 
Extremely important
94 
Slightly important
65 
Not at all important
33 

Length

Max length20
Median length19
Mean length17.60320641
Min length14

Characters and Unicode

Total characters8784
Distinct characters21
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowSlightly important
2nd rowNot at all important
3rd rowVery important
4th rowVery important
5th rowExtremely important

Common Values

ValueCountFrequency (%)
Very important162
32.5%
Moderately important145
29.1%
Extremely important94
18.8%
Slightly important65
13.0%
Not at all important33
 
6.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
important499
46.9%
very162
 
15.2%
moderately145
 
13.6%
extremely94
 
8.8%
slightly65
 
6.1%
not33
 
3.1%
at33
 
3.1%
all33
 
3.1%

Most occurring characters

ValueCountFrequency (%)
t1368
15.6%
r900
10.2%
a710
8.1%
o677
 
7.7%
e640
 
7.3%
m593
 
6.8%
565
 
6.4%
i564
 
6.4%
p499
 
5.7%
n499
 
5.7%
Other values (11)1769
20.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7720
87.9%
Space Separator565
 
6.4%
Uppercase Letter499
 
5.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t1368
17.7%
r900
11.7%
a710
9.2%
o677
8.8%
e640
8.3%
m593
7.7%
i564
7.3%
p499
 
6.5%
n499
 
6.5%
y466
 
6.0%
Other values (5)804
10.4%
Uppercase Letter
ValueCountFrequency (%)
V162
32.5%
M145
29.1%
E94
18.8%
S65
13.0%
N33
 
6.6%
Space Separator
ValueCountFrequency (%)
565
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8219
93.6%
Common565
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t1368
16.6%
r900
11.0%
a710
8.6%
o677
8.2%
e640
7.8%
m593
7.2%
i564
6.9%
p499
 
6.1%
n499
 
6.1%
y466
 
5.7%
Other values (10)1303
15.9%
Common
ValueCountFrequency (%)
565
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII8784
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t1368
15.6%
r900
10.2%
a710
8.1%
o677
 
7.7%
e640
 
7.3%
m593
 
6.8%
565
 
6.4%
i564
 
6.4%
p499
 
5.7%
n499
 
5.7%
Other values (11)1769
20.1%

design_partic_aware
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size839.0 B
Extremely important
191 
Very important
132 
Moderately important
92 
Slightly important
53 
Not at all important
31 

Length

Max length20
Median length19
Mean length17.81763527
Min length14

Characters and Unicode

Total characters8891
Distinct characters21
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowSlightly important
2nd rowModerately important
3rd rowModerately important
4th rowExtremely important
5th rowSlightly important

Common Values

ValueCountFrequency (%)
Extremely important191
38.3%
Very important132
26.5%
Moderately important92
18.4%
Slightly important53
 
10.6%
Not at all important31
 
6.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
important499
47.1%
extremely191
 
18.0%
very132
 
12.5%
moderately92
 
8.7%
slightly53
 
5.0%
not31
 
2.9%
at31
 
2.9%
all31
 
2.9%

Most occurring characters

ValueCountFrequency (%)
t1396
15.7%
r914
10.3%
e698
 
7.9%
m690
 
7.8%
a653
 
7.3%
o622
 
7.0%
561
 
6.3%
i552
 
6.2%
p499
 
5.6%
n499
 
5.6%
Other values (11)1807
20.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7831
88.1%
Space Separator561
 
6.3%
Uppercase Letter499
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t1396
17.8%
r914
11.7%
e698
8.9%
m690
8.8%
a653
8.3%
o622
7.9%
i552
 
7.0%
p499
 
6.4%
n499
 
6.4%
y468
 
6.0%
Other values (5)840
10.7%
Uppercase Letter
ValueCountFrequency (%)
E191
38.3%
V132
26.5%
M92
18.4%
S53
 
10.6%
N31
 
6.2%
Space Separator
ValueCountFrequency (%)
561
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8330
93.7%
Common561
 
6.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
t1396
16.8%
r914
11.0%
e698
8.4%
m690
8.3%
a653
7.8%
o622
7.5%
i552
 
6.6%
p499
 
6.0%
n499
 
6.0%
y468
 
5.6%
Other values (10)1339
16.1%
Common
ValueCountFrequency (%)
561
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII8891
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t1396
15.7%
r914
10.3%
e698
 
7.9%
m690
 
7.8%
a653
 
7.3%
o622
 
7.0%
561
 
6.3%
i552
 
6.2%
p499
 
5.6%
n499
 
5.6%
Other values (11)1807
20.3%

design_inter_impact
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size839.0 B
Very important
139 
Moderately important
114 
Extremely important
107 
Slightly important
80 
Not at all important
59 

Length

Max length20
Median length19
Mean length17.79358717
Min length14

Characters and Unicode

Total characters8879
Distinct characters21
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNot at all important
2nd rowNot at all important
3rd rowExtremely important
4th rowVery important
5th rowNot at all important

Common Values

ValueCountFrequency (%)
Very important139
27.9%
Moderately important114
22.8%
Extremely important107
21.4%
Slightly important80
16.0%
Not at all important59
11.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
important499
44.7%
very139
 
12.5%
moderately114
 
10.2%
extremely107
 
9.6%
slightly80
 
7.2%
not59
 
5.3%
at59
 
5.3%
all59
 
5.3%

Most occurring characters

ValueCountFrequency (%)
t1417
16.0%
r859
9.7%
a731
8.2%
o672
 
7.6%
617
 
6.9%
m606
 
6.8%
e581
 
6.5%
i579
 
6.5%
l499
 
5.6%
p499
 
5.6%
Other values (11)1819
20.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7763
87.4%
Space Separator617
 
6.9%
Uppercase Letter499
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t1417
18.3%
r859
11.1%
a731
9.4%
o672
8.7%
m606
7.8%
e581
7.5%
i579
7.5%
l499
 
6.4%
p499
 
6.4%
n499
 
6.4%
Other values (5)821
10.6%
Uppercase Letter
ValueCountFrequency (%)
V139
27.9%
M114
22.8%
E107
21.4%
S80
16.0%
N59
11.8%
Space Separator
ValueCountFrequency (%)
617
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8262
93.1%
Common617
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
t1417
17.2%
r859
10.4%
a731
8.8%
o672
8.1%
m606
7.3%
e581
7.0%
i579
7.0%
l499
 
6.0%
p499
 
6.0%
n499
 
6.0%
Other values (10)1320
16.0%
Common
ValueCountFrequency (%)
617
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII8879
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t1417
16.0%
r859
9.7%
a731
8.2%
o672
 
7.6%
617
 
6.9%
m606
 
6.8%
e581
 
6.5%
i579
 
6.5%
l499
 
5.6%
p499
 
5.6%
Other values (11)1819
20.5%

design_type_data
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size839.0 B
Very important
144 
Moderately important
130 
Extremely important
106 
Slightly important
70 
Not at all important
49 

Length

Max length20
Median length19
Mean length17.7755511
Min length14

Characters and Unicode

Total characters8870
Distinct characters21
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNot at all important
2nd rowNot at all important
3rd rowNot at all important
4th rowVery important
5th rowExtremely important

Common Values

ValueCountFrequency (%)
Very important144
28.9%
Moderately important130
26.1%
Extremely important106
21.2%
Slightly important70
14.0%
Not at all important49
 
9.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
important499
45.5%
very144
 
13.1%
moderately130
 
11.9%
extremely106
 
9.7%
slightly70
 
6.4%
not49
 
4.5%
at49
 
4.5%
all49
 
4.5%

Most occurring characters

ValueCountFrequency (%)
t1402
15.8%
r879
9.9%
a727
8.2%
o678
 
7.6%
e616
 
6.9%
m605
 
6.8%
597
 
6.7%
i569
 
6.4%
p499
 
5.6%
n499
 
5.6%
Other values (11)1799
20.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter7774
87.6%
Space Separator597
 
6.7%
Uppercase Letter499
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t1402
18.0%
r879
11.3%
a727
9.4%
o678
8.7%
e616
7.9%
m605
7.8%
i569
7.3%
p499
 
6.4%
n499
 
6.4%
l474
 
6.1%
Other values (5)826
10.6%
Uppercase Letter
ValueCountFrequency (%)
V144
28.9%
M130
26.1%
E106
21.2%
S70
14.0%
N49
 
9.8%
Space Separator
ValueCountFrequency (%)
597
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin8273
93.3%
Common597
 
6.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t1402
16.9%
r879
10.6%
a727
8.8%
o678
8.2%
e616
7.4%
m605
7.3%
i569
6.9%
p499
 
6.0%
n499
 
6.0%
l474
 
5.7%
Other values (10)1325
16.0%
Common
ValueCountFrequency (%)
597
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII8870
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t1402
15.8%
r879
9.9%
a727
8.2%
o678
 
7.6%
e616
 
6.9%
m605
 
6.8%
597
 
6.7%
i569
 
6.4%
p499
 
5.6%
n499
 
5.6%
Other values (11)1799
20.3%

design_add_fac
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct192
Distinct (%)96.0%
Missing299
Missing (%)59.9%
Memory size4.0 KiB
None
 
9
whether or not the research is going to be publicly available
 
1
no i think that covered them all
 
1
awareness by the user, extremely important
 
1
all the responses were raw and that's how the most accurate data is collected
 
1
Other values (187)
187 

Length

Max length652
Median length179.5
Mean length125.945
Min length4

Characters and Unicode

Total characters25189
Distinct characters43
Distinct categories10
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique191
Unique (%)95.5%
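
The percentages for design_add_fac use two different denominators: the missing share is relative to all rows, while the distinct and unique shares are consistent with being relative to the non-missing rows only. The sketch below checks this reading against the figures in the report. The variable names are illustrative, and 499 is the observation count from the report overview.

```python
# Figures copied from the report for design_add_fac.
n_rows = 499            # observations in the dataset
n_missing = 299
n_distinct = 192
n_unique = 191
total_characters = 25189

# Rows that actually contain a response.
n_present = n_rows - n_missing

missing_pct = 100 * n_missing / n_rows        # share of all rows
distinct_pct = 100 * n_distinct / n_present   # share of non-missing rows
unique_pct = 100 * n_unique / n_present       # share of non-missing rows
mean_length = total_characters / n_present    # characters per response

print(f"Missing (%):  {missing_pct:.1f}")
print(f"Distinct (%): {distinct_pct:.1f}")
print(f"Unique (%):   {unique_pct:.1f}")
print(f"Mean length:  {mean_length:.3f}")
```

With 200 non-missing responses, this reproduces 59.9%, 96.0%, 95.5% and a mean length of 125.945.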

Sample

1st rowthe only aspects of social media research that would cause concern for me is saving photographs or imaging data
2nd rownone that i can think of, other than what has been asked already
3rd rowreducing any type of hate is always a good thing
4th rowna i have already voiced my concerns about researching this in general from this surveys other questions
5th rowthe possibility of bot accounts spreading misinformation or hate speech just for the purpose of an experiment

Common Values

ValueCountFrequency (%)
None9
 
1.8%
whether or not the research is going to be publicly available1
 
0.2%
no i think that covered them all1
 
0.2%
awareness by the user, extremely important1
 
0.2%
all the responses were raw and that's how the most accurate data is collected1
 
0.2%
engagement in the study just like this survey, are they participants paying attention1
 
0.2%
there are no other aspects that i can think of1
 
0.2%
this type of research is only going to give you data on a small demographic as longas the researchers know this1
 
0.2%
i cannot think of any other aspects of research conducted on social media that are important to me in determining levels of concern1
 
0.2%
once again my focus, based on my experience on social media along with my interest in doing studiessurveys on prolific, is that users be informed if there is an intent to try to change or manipulate someone's online behaviorstudying posts objectively is one thing, but deliberately trying to get someone to act differently is unacceptable that is what advertisers, political operatives and propagandists do it is may also be why i have seen user activity drop off during this current campaign season too many accounts are being flagged for supposed violations when people have only posted honest questions regarding the facts or truth behind some claim1
 
0.2%
Other values (182)182
36.5%
(Missing)299
59.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the200
 
4.5%
of129
 
2.9%
is128
 
2.9%
to121
 
2.7%
i113
 
2.6%
a81
 
1.8%
that80
 
1.8%
and79
 
1.8%
it64
 
1.4%
are62
 
1.4%
Other values (1009)3369
76.1%

Most occurring characters

ValueCountFrequency (%)
4253
16.9%
e2532
10.1%
t2029
 
8.1%
a1715
 
6.8%
i1692
 
6.7%
o1566
 
6.2%
n1479
 
5.9%
s1342
 
5.3%
r1224
 
4.9%
h1051
 
4.2%
Other values (33)6306
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter20659
82.0%
Space Separator4253
 
16.9%
Other Punctuation229
 
0.9%
Dash Punctuation19
 
0.1%
Uppercase Letter9
 
< 0.1%
Open Punctuation6
 
< 0.1%
Close Punctuation5
 
< 0.1%
Final Punctuation4
 
< 0.1%
Decimal Number4
 
< 0.1%
Math Symbol1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e2532
12.3%
t2029
 
9.8%
a1715
 
8.3%
i1692
 
8.2%
o1566
 
7.6%
n1479
 
7.2%
s1342
 
6.5%
r1224
 
5.9%
h1051
 
5.1%
l794
 
3.8%
Other values (16)5235
25.3%
Other Punctuation
ValueCountFrequency (%)
,125
54.6%
'78
34.1%
?14
 
6.1%
"10
 
4.4%
1
 
0.4%
:1
 
0.4%
Decimal Number
ValueCountFrequency (%)
21
25.0%
41
25.0%
11
25.0%
31
25.0%
Space Separator
ValueCountFrequency (%)
4253
100.0%
Dash Punctuation
ValueCountFrequency (%)
-19
100.0%
Uppercase Letter
ValueCountFrequency (%)
N9
100.0%
Open Punctuation
ValueCountFrequency (%)
(6
100.0%
Close Punctuation
ValueCountFrequency (%)
)5
100.0%
Final Punctuation
ValueCountFrequency (%)
4
100.0%
Math Symbol
ValueCountFrequency (%)
+1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin20668
82.1%
Common4521
 
17.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e2532
12.3%
t2029
 
9.8%
a1715
 
8.3%
i1692
 
8.2%
o1566
 
7.6%
n1479
 
7.2%
s1342
 
6.5%
r1224
 
5.9%
h1051
 
5.1%
l794
 
3.8%
Other values (17)5244
25.4%
Common
ValueCountFrequency (%)
4253
94.1%
,125
 
2.8%
'78
 
1.7%
-19
 
0.4%
?14
 
0.3%
"10
 
0.2%
(6
 
0.1%
)5
 
0.1%
4
 
0.1%
1
 
< 0.1%
Other values (6)6
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII25184
> 99.9%
Punctuation5
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4253
16.9%
e2532
10.1%
t2029
 
8.1%
a1715
 
6.8%
i1692
 
6.7%
o1566
 
6.2%
n1479
 
5.9%
s1342
 
5.3%
r1224
 
4.9%
h1051
 
4.2%
Other values (31)6301
25.0%
Punctuation
ValueCountFrequency (%)
4
80.0%
1
 
20.0%

rank_sci_repro
Real number (ℝ≥0)

HIGH CORRELATION

Distinct7
Distinct (%)1.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.521042084
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 5
median: 6
Q3: 7
95-th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation1.679386636
Coefficient of variation (CV)0.3041792855
Kurtosis0.1087548301
Mean5.521042084
Median Absolute Deviation (MAD)1
Skewness-1.025007039
Sum2755
Variance2.820339474
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
7      203    40.7%
6      102    20.4%
5      70     14.0%
4      49     9.8%
3      42     8.4%
2      17     3.4%
1      16     3.2%

Value  Count  Frequency (%)
1      16     3.2%
2      17     3.4%
3      42     8.4%
4      49     9.8%
5      70     14.0%
6      102    20.4%
7      203    40.7%

Value  Count  Frequency (%)
7      203    40.7%
6      102    20.4%
5      70     14.0%
4      49     9.8%
3      42     8.4%
2      17     3.4%
1      16     3.2%
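
The descriptive statistics for rank_sci_repro can be cross-checked from its frequency table alone. The sketch below is one way to do that in plain Python. It assumes the sample variance (dividing by n - 1), which is the convention that matches the reported figure, and the names are illustrative.

```python
import math

# Rank value -> count for rank_sci_repro, copied from the report.
counts = {7: 203, 6: 102, 5: 70, 4: 49, 3: 42, 2: 17, 1: 16}

n = sum(counts.values())                              # number of observations
total = sum(value * c for value, c in counts.items()) # "Sum" in the report
mean = total / n

# Sample variance (ddof = 1), weighted by the count of each rank value.
variance = sum(c * (value - mean) ** 2 for value, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                       # coefficient of variation

print(f"n={n} sum={total} mean={mean:.9f}")
print(f"variance={variance:.9f} std={std:.9f} cv={cv:.10f}")
```

The results agree with the reported sum (2755), mean (5.521042084), variance (2.820339474), standard deviation (1.679386636) and coefficient of variation (0.3041792855).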

rank_resp
Real number (ℝ≥0)

HIGH CORRELATION

Distinct7
Distinct (%)1.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.527054108
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 5
95-th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation2.057596743
Coefficient of variation (CV)0.5833754402
Kurtosis-1.187605078
Mean3.527054108
Median Absolute Deviation (MAD)2
Skewness0.3541560179
Sum1760
Variance4.233704357
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
1      106    21.2%
2      93     18.6%
3      71     14.2%
4      70     14.0%
7      62     12.4%
6      56     11.2%
5      41     8.2%

Value  Count  Frequency (%)
1      106    21.2%
2      93     18.6%
3      71     14.2%
4      70     14.0%
5      41     8.2%
6      56     11.2%
7      62     12.4%

Value  Count  Frequency (%)
7      62     12.4%
6      56     11.2%
5      41     8.2%
4      70     14.0%
3      71     14.2%
2      93     18.6%
1      106    21.2%
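
The quantile statistics for rank_resp can likewise be recovered from the frequency table. The sketch below expands the counts into a sorted list and applies linear interpolation between neighbouring order statistics. That interpolation rule is an assumption here, although with integer ranks any common rule gives the same answers. The `quantile` helper is hypothetical and written for this example.

```python
# Rank value -> count for rank_resp, copied from the report.
counts = {1: 106, 2: 93, 3: 71, 4: 70, 5: 41, 6: 56, 7: 62}

# Expand the frequency table into one sorted list of observations.
values = sorted(value for value, c in counts.items() for _ in range(c))


def quantile(sorted_values, q):
    """Quantile q (0..1) by linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


p5 = quantile(values, 0.05)
q1 = quantile(values, 0.25)
median = quantile(values, 0.50)
q3 = quantile(values, 0.75)
p95 = quantile(values, 0.95)
iqr = q3 - q1

# Median absolute deviation: median of the distances from the median.
mad = quantile(sorted(abs(v - median) for v in values), 0.50)

print(p5, q1, median, q3, p95, iqr, mad)
```

This gives a 5th percentile of 1, Q1 of 2, median of 3, Q3 of 5, 95th percentile of 7, IQR of 3 and MAD of 2, matching the report.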

rank_just
Real number (ℝ≥0)

HIGH CORRELATION

Distinct7
Distinct (%)1.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4.651302605
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 5
Q3: 6
95-th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation1.671474984
Coefficient of variation (CV)0.3593563192
Kurtosis-0.5866613384
Mean4.651302605
Median Absolute Deviation (MAD)1
Skewness-0.4219703321
Sum2321
Variance2.793828621
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
5      115    23.0%
6      96     19.2%
4      91     18.2%
7      75     15.0%
3      63     12.6%
2      33     6.6%
1      26     5.2%

Value  Count  Frequency (%)
1      26     5.2%
2      33     6.6%
3      63     12.6%
4      91     18.2%
5      115    23.0%
6      96     19.2%
7      75     15.0%

Value  Count  Frequency (%)
7      75     15.0%
6      96     19.2%
5      115    23.0%
4      91     18.2%
3      63     12.6%
2      33     6.6%
1      26     5.2%

rank_anony
Real number (ℝ≥0)

HIGH CORRELATION

Distinct7
Distinct (%)1.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.054108216
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 4
95-th percentile: 6
Maximum: 7
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation1.587816994
Coefficient of variation (CV)0.5198954594
Kurtosis-0.4222403
Mean3.054108216
Median Absolute Deviation (MAD)1
Skewness0.6240587048
Sum1524
Variance2.521162808
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
2      141    28.3%
3      105    21.0%
1      81     16.2%
4      74     14.8%
5      51     10.2%
6      34     6.8%
7      13     2.6%

Value  Count  Frequency (%)
1      81     16.2%
2      141    28.3%
3      105    21.0%
4      74     14.8%
5      51     10.2%
6      34     6.8%
7      13     2.6%

Value  Count  Frequency (%)
7      13     2.6%
6      34     6.8%
5      51     10.2%
4      74     14.8%
3      105    21.0%
2      141    28.3%
1      81     16.2%

rank_harms
Real number (ℝ≥0)

HIGH CORRELATION

Distinct7
Distinct (%)1.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.69739479
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 4
95-th percentile: 6
Maximum: 7
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation1.808784129
Coefficient of variation (CV)0.6705670732
Kurtosis-0.4651366385
Mean2.69739479
Median Absolute Deviation (MAD)1
Skewness0.8169441724
Sum1346
Variance3.271700027
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
1      189    37.9%
2      91     18.2%
3      66     13.2%
4      58     11.6%
5      45     9.0%
6      30     6.0%
7      20     4.0%

Value  Count  Frequency (%)
1      189    37.9%
2      91     18.2%
3      66     13.2%
4      58     11.6%
5      45     9.0%
6      30     6.0%
7      20     4.0%

Value  Count  Frequency (%)
7      20     4.0%
6      30     6.0%
5      45     9.0%
4      58     11.6%
3      66     13.2%
2      91     18.2%
1      189    37.9%

rank_balance
Real number (ℝ≥0)

HIGH CORRELATION

Distinct7
Distinct (%)1.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4.975951904
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
median: 5
Q3: 6
95-th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation1.616749319
Coefficient of variation (CV)0.3249125695
Kurtosis-0.4443151664
Mean4.975951904
Median Absolute Deviation (MAD)1
Skewness-0.5992454593
Sum2483
Variance2.613878359
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
6      122    24.4%
5      111    22.2%
7      97     19.4%
4      68     13.6%
3      58     11.6%
2      28     5.6%
1      15     3.0%

Value  Count  Frequency (%)
1      15     3.0%
2      28     5.6%
3      58     11.6%
4      68     13.6%
5      111    22.2%
6      122    24.4%
7      97     19.4%

Value  Count  Frequency (%)
7      97     19.4%
6      122    24.4%
5      111    22.2%
4      68     13.6%
3      58     11.6%
2      28     5.6%
1      15     3.0%

rank_pub_interst
Real number (ℝ≥0)

HIGH CORRELATION

Distinct7
Distinct (%)1.4%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.573146293
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 5
95-th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation1.760841438
Coefficient of variation (CV)0.4927985854
Kurtosis-0.9434350009
Mean3.573146293
Median Absolute Deviation (MAD)1
Skewness0.2559463962
Sum1783
Variance3.100562571
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
2      96     19.2%
3      94     18.8%
4      89     17.8%
1      66     13.2%
5      66     13.2%
6      59     11.8%
7      29     5.8%

Value  Count  Frequency (%)
1      66     13.2%
2      96     19.2%
3      94     18.8%
4      89     17.8%
5      66     13.2%
6      59     11.8%
7      29     5.8%

Value  Count  Frequency (%)
7      29     5.8%
6      59     11.8%
5      66     13.2%
4      89     17.8%
3      94     18.8%
2      96     19.2%
1      66     13.2%

rank_add_fac_1
Categorical

HIGH CARDINALITY
HIGH CORRELATION
MISSING

Distinct67
Distinct (%)82.7%
Missing418
Missing (%)83.8%
Memory size4.0 KiB
None
15 
carrying out the study in a way that it is not going to be skewed based upon what the user believes the researcher is looking for
 
1
honesty when disseminating the information
 
1
inform participants of study after
 
1
understanding the limits of the internet
 
1
Other values (62)
62 

Length

Max length336
Median length163
Mean length79.72839506
Min length4

Characters and Unicode

Total characters6458
Distinct characters39
Distinct categories9
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique66
Unique (%)81.5%

Sample

1st rowa full disclosure of any political organizations of which a researcher belongs to or has donated to within a previous time frame (such as 4 yrs)
2nd rowthe researchers should not intrude into the user's personal lives
3rd rowfull disclosure of intent of research
4th rowNone
5th rowdon't intentionally mislead unknowing participants sending them links to a fake fact checking site would be an example 4

Common Values

ValueCountFrequency (%)
None15
 
3.0%
carrying out the study in a way that it is not going to be skewed based upon what the user believes the researcher is looking for1
 
0.2%
honesty when disseminating the information1
 
0.2%
inform participants of study after1
 
0.2%
understanding the limits of the internet1
 
0.2%
the social media service (twittet, facebook, etc) knows the data is being collected1
 
0.2%
explain where they get the info on people to study did they pay for it?1
 
0.2%
methods used to interact with participants whether awareunaware and the expected knowledge of social platforms for modern usersie - using automated bot accounts will likely skew experiment results due to intelligent users identifying bots versus real humans1
 
0.2%
usefulness - is the study or experiment actually useful in the sense that its results or process is beneficial to those in the study or to people outside of it, aside from the researchers?1
 
0.2%
i think the above list covers everything if there are more things to consider, they're not coming to mind for me, and i've sat here thinking for awhile (timer is down to 12 minutes left, and i have no idea how many more pages are left in this study, so i should probably stop typing here and carry on so i don't run out of time overall)1
 
0.2%
Other values (57)57
 
11.4%
(Missing)418
83.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the60
 
5.6%
to33
 
3.1%
of31
 
2.9%
be25
 
2.3%
and21
 
2.0%
for19
 
1.8%
is18
 
1.7%
that17
 
1.6%
study17
 
1.6%
or17
 
1.6%
Other values (441)809
75.8%

Most occurring characters

ValueCountFrequency (%)
990
15.3%
e625
 
9.7%
t555
 
8.6%
a443
 
6.9%
i437
 
6.8%
o421
 
6.5%
n381
 
5.9%
s353
 
5.5%
r345
 
5.3%
h231
 
3.6%
Other values (29)1677
26.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter5345
82.8%
Space Separator990
 
15.3%
Other Punctuation66
 
1.0%
Uppercase Letter15
 
0.2%
Close Punctuation11
 
0.2%
Open Punctuation11
 
0.2%
Dash Punctuation10
 
0.2%
Decimal Number9
 
0.1%
Final Punctuation1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e625
11.7%
t555
10.4%
a443
 
8.3%
i437
 
8.2%
o421
 
7.9%
n381
 
7.1%
s353
 
6.6%
r345
 
6.5%
h231
 
4.3%
d204
 
3.8%
Other values (16)1350
25.3%
Other Punctuation
ValueCountFrequency (%)
,36
54.5%
'18
27.3%
"8
 
12.1%
?4
 
6.1%
Decimal Number
ValueCountFrequency (%)
24
44.4%
43
33.3%
12
22.2%
Space Separator
ValueCountFrequency (%)
990
100.0%
Uppercase Letter
ValueCountFrequency (%)
N15
100.0%
Close Punctuation
ValueCountFrequency (%)
)11
100.0%
Open Punctuation
ValueCountFrequency (%)
(11
100.0%
Dash Punctuation
ValueCountFrequency (%)
-10
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin5360
83.0%
Common1098
 
17.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e625
11.7%
t555
10.4%
a443
 
8.3%
i437
 
8.2%
o421
 
7.9%
n381
 
7.1%
s353
 
6.6%
r345
 
6.4%
h231
 
4.3%
d204
 
3.8%
Other values (17)1365
25.5%
Common
ValueCountFrequency (%)
990
90.2%
,36
 
3.3%
'18
 
1.6%
)11
 
1.0%
(11
 
1.0%
-10
 
0.9%
"8
 
0.7%
?4
 
0.4%
24
 
0.4%
43
 
0.3%
Other values (2)3
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII6457
> 99.9%
Punctuation1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
990
15.3%
e625
 
9.7%
t555
 
8.6%
a443
 
6.9%
i437
 
6.8%
o421
 
6.5%
n381
 
5.9%
s353
 
5.5%
r345
 
5.3%
h231
 
3.6%
Other values (28)1676
26.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

rank_add_fac_1_pos
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct11
Distinct (%)7.4%
Missing350
Missing (%)70.1%
Infinite0
Infinite (%)0.0%
Mean4.429530201
Minimum0
Maximum10
Zeros12
Zeros (%)2.4%
Negative0
Negative (%)0.0%
Memory size4.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 4
Q3: 8
95-th percentile: 8.6
Maximum: 10
Range: 10
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation3.323249455
Coefficient of variation (CV)0.7502487406
Kurtosis-1.605372616
Mean4.429530201
Median Absolute Deviation (MAD)3
Skewness0.1127040338
Sum660
Variance11.04398694
MonotonicityNot monotonic
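
The summary figures above are related by simple identities, which makes them easy to sanity-check. The following minimal sketch re-derives a few of them from the values reported for `rank_add_fac_1_pos`; the variable names are chosen for this example only, and the tolerance is an assumption that allows for the rounding in the printed figures.

```python
from math import isclose

# Values copied from the report for rank_add_fac_1_pos.
n_rows, n_missing = 499, 350
total = 660
mean = 4.429530201
std = 3.323249455
variance = 11.04398694
cv = 0.7502487406
minimum, q1, q3, maximum = 0, 1, 8, 10

n_valid = n_rows - n_missing  # non-missing observations

checks = {
    "mean = sum / n_valid": isclose(total / n_valid, mean, rel_tol=1e-6),
    "variance = std ** 2": isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": isclose(std / mean, cv, rel_tol=1e-6),
    "IQR = Q3 - Q1": (q3 - q1) == 7,
    "range = max - min": (maximum - minimum) == 10,
}

if __name__ == "__main__":
    for name, ok in checks.items():
        print(name, "OK" if ok else "MISMATCH")
```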
Histogram with fixed size bins (bins=11)
Value      Count  Frequency (%)
8          44     8.8%
1          42     8.4%
0          12     2.4%
3          12     2.4%
4          10     2.0%
10         7      1.4%
6          6      1.2%
7          6      1.2%
5          5      1.0%
2          4      0.8%
(Missing)  350    70.1%

Value      Count  Frequency (%)
0          12     2.4%
1          42     8.4%
2          4      0.8%
3          12     2.4%
4          10     2.0%
5          5      1.0%
6          6      1.2%
7          6      1.2%
8          44     8.8%
9          1      0.2%

Value      Count  Frequency (%)
10         7      1.4%
9          1      0.2%
8          44     8.8%
7          6      1.2%
6          6      1.2%
5          5      1.0%
4          10     2.0%
3          12     2.4%
2          4      0.8%
1          42     8.4%

rank_add_fac_2
Categorical

HIGH CORRELATION
MISSING

Distinct19
Distinct (%)73.1%
Missing473
Missing (%)94.8%
Memory size4.0 KiB
None
interact as a researcher it will carry more weight if people know who is suggesting or informing and this does matter
 
1
minimize threats to social media users mental health by the study manipulations
 
1
must be for positive societal movement, not backwards movement
 
1
researchers must maintain participant confidentiality
 
1
Other values (14)
14 

Length

Max length132
Median length79
Mean length40.30769231
Min length4

Characters and Unicode

Total characters1048
Distinct characters28
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique18 ?
Unique (%)69.2%

Sample

1st rowNone
2nd rowabiding by the laws of research
3rd rowNone
4th rowensure payments alongside bonus
5th rowprivacy that their identity will not be known

Common Values

ValueCountFrequency (%)
None8
 
1.6%
interact as a researcher it will carry more weight if people know who is suggesting or informing and this does matter1
 
0.2%
minimize threats to social media users mental health by the study manipulations1
 
0.2%
must be for positive societal movement, not backwards movement1
 
0.2%
researchers must maintain participant confidentiality1
 
0.2%
at the conclusion, the user should have the option to have their data dismissed1
 
0.2%
laws keep lawsuits to a minimal1
 
0.2%
information about who the researchers are1
 
0.2%
they should be required to state which political party they make financial contributions to , and how much money they give each year1
 
0.2%
be mindful of manipulation that could skewer the results you desire1
 
0.2%
Other values (9)9
 
1.8%
(Missing)473
94.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the11
 
6.3%
none8
 
4.6%
to6
 
3.4%
be4
 
2.3%
and3
 
1.7%
of3
 
1.7%
they3
 
1.7%
study3
 
1.7%
are3
 
1.7%
movement2
 
1.1%
Other values (112)128
73.6%

Most occurring characters

ValueCountFrequency (%)
148
14.1%
e110
10.5%
t84
 
8.0%
a75
 
7.2%
i74
 
7.1%
s63
 
6.0%
o63
 
6.0%
n60
 
5.7%
r55
 
5.2%
h44
 
4.2%
Other values (18)272
26.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter889
84.8%
Space Separator148
 
14.1%
Uppercase Letter8
 
0.8%
Other Punctuation3
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e110
12.4%
t84
 
9.4%
a75
 
8.4%
i74
 
8.3%
s63
 
7.1%
o63
 
7.1%
n60
 
6.7%
r55
 
6.2%
h44
 
4.9%
l30
 
3.4%
Other values (15)231
26.0%
Space Separator
ValueCountFrequency (%)
148
100.0%
Uppercase Letter
ValueCountFrequency (%)
N8
100.0%
Other Punctuation
ValueCountFrequency (%)
,3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin897
85.6%
Common151
 
14.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e110
12.3%
t84
 
9.4%
a75
 
8.4%
i74
 
8.2%
s63
 
7.0%
o63
 
7.0%
n60
 
6.7%
r55
 
6.1%
h44
 
4.9%
l30
 
3.3%
Other values (16)239
26.6%
Common
ValueCountFrequency (%)
148
98.0%
,3
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1048
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
148
14.1%
e110
10.5%
t84
 
8.0%
a75
 
7.2%
i74
 
7.1%
s63
 
6.0%
o63
 
6.0%
n60
 
5.7%
r55
 
5.2%
h44
 
4.2%
Other values (18)272
26.0%

rank_add_fac_2_pos
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct10
Distinct (%)11.5%
Missing412
Missing (%)82.6%
Infinite0
Infinite (%)0.0%
Mean5.172413793
Minimum0
Maximum10
Zeros11
Zeros (%)2.2%
Negative0
Negative (%)0.0%
Memory size4.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 5
Q3: 9
95-th percentile: 10
Maximum: 10
Range: 10
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation3.84044237
Coefficient of variation (CV)0.7424855248
Kurtosis-1.808645945
Mean5.172413793
Median Absolute Deviation (MAD)4
Skewness-0.05284867363
Sum450
Variance14.74899759
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)
Value      Count  Frequency (%)
9          29     5.8%
2          20     4.0%
0          11     2.2%
1          8      1.6%
10         7      1.4%
5          4      0.8%
8          4      0.8%
3          2      0.4%
7          1      0.2%
6          1      0.2%
(Missing)  412    82.6%

Value      Count  Frequency (%)
0          11     2.2%
1          8      1.6%
2          20     4.0%
3          2      0.4%
5          4      0.8%
6          1      0.2%
7          1      0.2%
8          4      0.8%
9          29     5.8%
10         7      1.4%

Value      Count  Frequency (%)
10         7      1.4%
9          29     5.8%
8          4      0.8%
7          1      0.2%
6          1      0.2%
5          4      0.8%
3          2      0.4%
2          20     4.0%
1          8      1.6%
0          11     2.2%

rank_add_fac_3
Categorical

HIGH CORRELATION
MISSING

Distinct19
Distinct (%)86.4%
Missing477
Missing (%)95.6%
Memory size4.0 KiB
None
ethical application in the real world
 
1
cost vs benefits for study participants will they be compensated in some way?
 
1
inform participants about the study's outcomes after the study has been concluded and analyzed
 
1
length of study and data collection
 
1
Other values (14)
14 

Length

Max length94
Median length63.5
Mean length43.36363636
Min length4

Characters and Unicode

Total characters954
Distinct characters33
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique18 ?
Unique (%)81.8%

Sample

1st rowgive results after entire experiment is done
2nd rowconfidentiality of participants information
3rd rowNone
4th rowcollection of data that is personal
5th rowpromise to offer access to study results ifwhen available

Common Values

ValueCountFrequency (%)
None4
 
0.8%
ethical application in the real world1
 
0.2%
cost vs benefits for study participants will they be compensated in some way?1
 
0.2%
inform participants about the study's outcomes after the study has been concluded and analyzed1
 
0.2%
length of study and data collection1
 
0.2%
all unknowing participants notified of use of their info1
 
0.2%
research misconduct, such as:fabrication, falsification, and plagiarism1
 
0.2%
size of the study how many people are they collecting data on1
 
0.2%
the user should be given the results of the study when it is over1
 
0.2%
adherence to regulations (like gdpr)1
 
0.2%
Other values (9)9
 
1.8%
(Missing)477
95.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
the11
 
7.1%
of8
 
5.2%
study6
 
3.9%
none4
 
2.6%
is4
 
2.6%
participants4
 
2.6%
and4
 
2.6%
to4
 
2.6%
data3
 
1.9%
are3
 
1.9%
Other values (91)103
66.9%

Most occurring characters

ValueCountFrequency (%)
133
13.9%
e93
 
9.7%
t87
 
9.1%
i73
 
7.7%
n63
 
6.6%
a62
 
6.5%
o61
 
6.4%
s54
 
5.7%
r42
 
4.4%
l34
 
3.6%
Other values (23)252
26.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter808
84.7%
Space Separator133
 
13.9%
Other Punctuation6
 
0.6%
Uppercase Letter4
 
0.4%
Open Punctuation1
 
0.1%
Close Punctuation1
 
0.1%
Final Punctuation1
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e93
11.5%
t87
10.8%
i73
 
9.0%
n63
 
7.8%
a62
 
7.7%
o61
 
7.5%
s54
 
6.7%
r42
 
5.2%
l34
 
4.2%
h33
 
4.1%
Other values (14)206
25.5%
Other Punctuation
ValueCountFrequency (%)
,3
50.0%
'1
 
16.7%
?1
 
16.7%
:1
 
16.7%
Space Separator
ValueCountFrequency (%)
133
100.0%
Uppercase Letter
ValueCountFrequency (%)
N4
100.0%
Open Punctuation
ValueCountFrequency (%)
(1
100.0%
Close Punctuation
ValueCountFrequency (%)
)1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin812
85.1%
Common142
 
14.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e93
11.5%
t87
10.7%
i73
 
9.0%
n63
 
7.8%
a62
 
7.6%
o61
 
7.5%
s54
 
6.7%
r42
 
5.2%
l34
 
4.2%
h33
 
4.1%
Other values (15)210
25.9%
Common
ValueCountFrequency (%)
133
93.7%
,3
 
2.1%
'1
 
0.7%
?1
 
0.7%
:1
 
0.7%
(1
 
0.7%
)1
 
0.7%
1
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII953
99.9%
Punctuation1
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
133
14.0%
e93
 
9.8%
t87
 
9.1%
i73
 
7.7%
n63
 
6.6%
a62
 
6.5%
o61
 
6.4%
s54
 
5.7%
r42
 
4.4%
l34
 
3.6%
Other values (22)251
26.3%
Punctuation
ValueCountFrequency (%)
1
100.0%

rank_add_fac_3_pos
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct9
Distinct (%)11.0%
Missing417
Missing (%)83.6%
Infinite0
Infinite (%)0.0%
Mean5.817073171
Minimum0
Maximum10
Zeros11
Zeros (%)2.2%
Negative0
Negative (%)0.0%
Memory size4.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 7
Q3: 10
95-th percentile: 10
Maximum: 10
Range: 10
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation4.079845219
Coefficient of variation (CV)0.7013570397
Kurtosis-1.742930501
Mean5.817073171
Median Absolute Deviation (MAD)3
Skewness-0.181235257
Sum477
Variance16.64513701
MonotonicityNot monotonic
Histogram with fixed size bins (bins=9)
Value      Count  Frequency (%)
10         32     6.4%
3          15     3.0%
0          11     2.2%
1          7      1.4%
9          5      1.0%
8          4      0.8%
2          4      0.8%
6          2      0.4%
4          2      0.4%
(Missing)  417    83.6%

Value      Count  Frequency (%)
0          11     2.2%
1          7      1.4%
2          4      0.8%
3          15     3.0%
4          2      0.4%
6          2      0.4%
8          4      0.8%
9          5      1.0%
10         32     6.4%

Value      Count  Frequency (%)
10         32     6.4%
9          5      1.0%
8          4      0.8%
6          2      0.4%
4          2      0.4%
3          15     3.0%
2          4      0.8%
1          7      1.4%
0          11     2.2%

aware_sm_advan_score
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct8
Distinct (%)1.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.869739479
Minimum-3
Maximum4
Zeros47
Zeros (%)9.4%
Negative21
Negative (%)4.2%
Memory size4.0 KiB

Quantile statistics

Minimum: -3
5-th percentile: 0
Q1: 1
median: 2
Q3: 3
95-th percentile: 3
Maximum: 4
Range: 7
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation1.212827247
Coefficient of variation (CV)0.6486610894
Kurtosis0.112523089
Mean1.869739479
Median Absolute Deviation (MAD)1
Skewness-0.6708705122
Sum933
Variance1.470949932
MonotonicityNot monotonic
Histogram with fixed size bins (bins=8)
Value  Count  Frequency (%)
3      167    33.5%
2      141    28.3%
1      106    21.2%
0      47     9.4%
-1     19     3.8%
4      17     3.4%
-2     1      0.2%
-3     1      0.2%

Value  Count  Frequency (%)
-3     1      0.2%
-2     1      0.2%
-1     19     3.8%
0      47     9.4%
1      106    21.2%
2      141    28.3%
3      167    33.5%
4      17     3.4%

Value  Count  Frequency (%)
4      17     3.4%
3      167    33.5%
2      141    28.3%
1      106    21.2%
0      47     9.4%
-1     19     3.8%
-2     1      0.2%
-3     1      0.2%

aware_sm_interact_score
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)1.0%
Missing0
Missing (%)0.0%
Memory size4.0 KiB
1
156 
2
146 
0
117 
3
68 
-1
 
12

Length

Max length2
Median length1
Mean length1.024048096
Min length1

Characters and Unicode

Total characters511
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row2
4th row1
5th row3

Common Values

Value  Count  Frequency (%)
1      156    31.3%
2      146    29.3%
0      117    23.4%
3      68     13.6%
-1     12     2.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1168
33.7%
2146
29.3%
0117
23.4%
368
13.6%

Most occurring characters

ValueCountFrequency (%)
1168
32.9%
2146
28.6%
0117
22.9%
368
13.3%
-12
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number499
97.7%
Dash Punctuation12
 
2.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1168
33.7%
2146
29.3%
0117
23.4%
368
13.6%
Dash Punctuation
ValueCountFrequency (%)
-12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common511
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1168
32.9%
2146
28.6%
0117
22.9%
368
13.3%
-12
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII511
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1168
32.9%
2146
28.6%
0117
22.9%
368
13.3%
-12
 
2.3%

aware_sm_use_score
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct10
Distinct (%)2.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean6.569138277
Minimum0
Maximum9
Zeros5
Zeros (%)1.0%
Negative0
Negative (%)0.0%
Memory size4.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 5
median: 7
Q3: 9
95-th percentile: 9
Maximum: 9
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation2.368791708
Coefficient of variation (CV)0.3605939787
Kurtosis-0.4509960942
Mean6.569138277
Median Absolute Deviation (MAD)2
Skewness-0.6818309309
Sum3278
Variance5.611174156
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
9      167    33.5%
5      61     12.2%
7      59     11.8%
6      55     11.0%
8      52     10.4%
4      47     9.4%
3      31     6.2%
1      14     2.8%
2      8      1.6%
0      5      1.0%

Value  Count  Frequency (%)
0      5      1.0%
1      14     2.8%
2      8      1.6%
3      31     6.2%
4      47     9.4%
5      61     12.2%
6      55     11.0%
7      59     11.8%
8      52     10.4%
9      167    33.5%

Value  Count  Frequency (%)
9      167    33.5%
8      52     10.4%
7      59     11.8%
6      55     11.0%
5      61     12.2%
4      47     9.4%
3      31     6.2%
2      8      1.6%
1      14     2.8%
0      5      1.0%

Interactions

[Figures: 196 pairwise interaction plots, not reproduced here.]

Correlations

Auto

The auto setting is an easily interpretable pairwise column metric that picks a method by variable type: categorical-categorical pairs use Cramér's V, numerical-categorical pairs use Cramér's V (with the numerical column discretized), and numerical-numerical pairs use Spearman's ρ. This configuration uses the most suitable method for each pair of columns.

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
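
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report generator runs: the helper names `average_ranks`, `pearson_r`, and `spearman_rho` are hypothetical, and giving tied values the average of their rank positions is an assumed convention.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship: the ranks agree perfectly.
x = [1, 2, 3, 4]
y = [1, 8, 27, 64]
rho = spearman_rho(x, y)
```

Because only the ranks enter the calculation, any strictly increasing transformation of either variable leaves ρ unchanged.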

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
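
As a rough illustration of this formula and of the invariance property, the sketch below computes r in plain Python. The function name `pearson_r` is hypothetical; the input values are the `age` and `rank_sci_repro` entries of the first five sample rows and are used only as example data.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Example data only: first five sample rows of age and rank_sci_repro.
age = [29.0, 33.0, 33.0, 73.0, 27.0]
rank_sci_repro = [2.0, 3.0, 7.0, 7.0, 3.0]

r = pearson_r(age, rank_sci_repro)

# Shifting and rescaling one variable (positive scale factor) leaves r unchanged.
r_rescaled = pearson_r([2 * a + 10 for a in age], rank_sci_repro)
```

The same normalization (population or sample) must be used for the covariance and both standard deviations; the factor then cancels, so either choice gives the same r.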

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
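
The pair-counting definition above corresponds to the variant usually called τ-a, sketched below in plain Python. The function name `kendall_tau_a` is hypothetical, and tied pairs are simply counted as neither concordant nor discordant; libraries commonly use tie-adjusted variants such as τ-b, so the report's figures may differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1      # both variables order the pair the same way
        elif product < 0:
            discordant += 1      # the variables order the pair in opposite ways
        # product == 0 means a tie in x or y; it counts toward neither total
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

# Three observations: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

This direct version compares every pair and therefore takes time quadratic in the number of observations, which is fine for a dataset of a few hundred rows.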

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
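
For orientation, the sketch below computes a bias-corrected Cramér's V from a contingency table in plain Python, following the correction usually attributed to Bergsma (2013). It is an illustration under stated assumptions, not the report generator's own code: the function name `cramers_v_corrected` is hypothetical, the table is assumed to be a list of rows of counts, and every row and column is assumed to have a non-zero total.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k contingency table of counts."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Illustrative 2 x 2 tables (made-up counts).
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[20, 0], [0, 20]])
```

The `max(0.0, ...)` clamp is what keeps the corrected value at 0 for tables that show no more association than chance would produce.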

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
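
One way to read the nullity correlation is as an ordinary Pearson correlation between presence indicators (1 where a value is present, 0 where it is missing). The plain-Python sketch below illustrates that reading; the function name `nullity_correlation`, the use of `None` for missing cells, and the example values are assumptions made for illustration.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n
    std_a = sqrt(sum((p - mean_a) ** 2 for p in a) / n)
    std_b = sqrt(sum((q - mean_b) ** 2 for q in b) / n)
    if std_a == 0 or std_b == 0:
        # A column that is always present or always missing has no defined correlation.
        return float("nan")
    return cov / (std_a * std_b)

# Made-up columns: the first two are missing in the same rows, the third in the opposite rows.
conc = [None, "text", None, "text"]
add_info = [None, "text", None, "text"]
other = ["text", None, "text", None]

same_pattern = nullity_correlation(conc, add_info)
opposite_pattern = nullity_correlation(conc, other)
```

A value near +1 means the two columns tend to be filled in together, a value near -1 means one tends to be filled in when the other is missing, and a value near 0 means the missingness patterns are unrelated.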

Sample

First rows

df_index | sm_use | age | gender_id | ethnic_id | edu | politic_views | aware_sm_res | aware_sm_advan | aware_sm_interact | aware_sm_use | ethic_appr | study_1_ethic_acc | study_1_conc | study_1_add_info | study_2_ethic_acc | study_2_conc | study_2_add_info | study_3_ethic_acc | study_3_conc | study_3_add_info | study_4_ethic_acc | study_4_conc | study_4_add_info | design_cont | design_num_users | design_res_purp | design_len_data | design_admin_inter | design_inter_type | design_partic_aware | design_inter_impact | design_type_data | design_add_fac | rank_sci_repro | rank_resp | rank_just | rank_anony | rank_harms | rank_balance | rank_pub_interst | rank_add_fac_1 | rank_add_fac_1_pos | rank_add_fac_2 | rank_add_fac_2_pos | rank_add_fac_3 | rank_add_fac_3_pos | aware_sm_advan_score | aware_sm_interact_score | aware_sm_use_score
01Facebook29.0MaleAsian - EasternHighschoolSlightly liberalExtremely aware[… are large and can contain millions of data points, … reflect events in real-time and can be collected continuously over time, … are naturalistic in that they do not require researchers to directly interact with research volunteers, … often capture social relationships not found using traditional methods (e.g. surveys)][Creating fake accounts ("bots"), Secretly changing the content of what users see][Political elections (e.g. voting behavior), Economic forecasting, Presidential approval ratings, Health topics (e.g. spread of diseases), Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation), Social networks]The scope of the project and actions there in do not cross certain boundaries that may purposefully negatively affect participants as well as legal regulations and standard practices.NeutralNaNNaNNeutralNaNNaNNeutralNaNNaNNeutralNaNNaNNot at all importantNot at all importantNot at all importantNot at all importantNot at all importantSlightly importantSlightly importantNot at all importantNot at all importantNaN2.07.05.06.04.03.01.0NaNNaNNaNNaNNaNNaN409
12Twitter33.0MaleMixed raceHighschoolNeutral/ Neither conservative or liberalModerately aware[… are large and can contain millions of data points][Privately messaging users, Publicly posting on users' profiles, Secretly changing the content of what users see][Political elections (e.g. voting behavior), Economic forecasting, Presidential approval ratings, Health topics (e.g. spread of diseases), Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation), Social networks]I think Ethical Approval means that the experiment is gathering data without harm or injury to people.NaNNaNNaNNaNNaNNaNNaNNaNNaNNeutralNaNNaNNot at all importantNot at all importantNot at all importantNot at all importantNot at all importantNot at all importantModerately importantNot at all importantNot at all importantthe only aspects of social media research that would cause concern for me is saving photographs or imaging data3.05.02.06.01.07.04.0NaNNaNNaNNaNNaNNaN119
23Facebook33.0FemalePacific IslanderBachelor's degreeVery liberalExtremely aware[… are large and can contain millions of data points, … are naturalistic in that they do not require researchers to directly interact with research volunteers, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect][Privately messaging users, Publicly posting on users' profiles, Creating fake accounts ("bots"), Secretly changing the content of what users see][Political elections (e.g. voting behavior), Presidential approval ratings, Communication (e.g. spread of opinions and hate-speech), News consumption (e.g. sharing of misinformation), Social networks]Researchers focus on ethical standards towards those they gain data from. They need approval of their approach and receive methods.NaNno concerns i would have loved to partake in this study in terms of watching the resultsNaNNaNgoing to the poster privately provided opportunity for change without the possibly of increased toxicity from users i prefer this method over commenting the "correct information"NaNSomewhat acceptablei find this is ethical as long as participants were fully aware of what was being monitored the results are interesting! 
no concernsNaNSomewhat unacceptablei am uncertain how i feel completely about a researcher creating a fake account however i do understand the desire to protect themselves and to not give away their actions as being part of a study this misinformation needed to be corrected for the public but it opened the original poster to toxicity the op may not have known it was incorrectthe researchers had a purpose in seeing the responses of those interacting with the post i do not agree with how it was done entirely however i do not know a better way to get the results that were desiredExtremely importantVery importantVery importantExtremely importantModerately importantVery importantModerately importantExtremely importantNot at all importantnone that i can think of, other than what has been asked already7.05.06.03.02.04.01.0NaNNaNNaNNaNNaNNaN225
34Facebook73.0FemaleWhite / CaucasianHighschoolSlightly conservativeModerately aware[… are large and can contain millions of data points, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect][Creating fake accounts ("bots")][Political elections (e.g. voting behavior), Presidential approval ratings, Health topics (e.g. spread of diseases), Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation)]I would think that using "ethical approval" means that the things others collect on social media sites would need to be honest and moral. Hopefully, there would be no under-handedness used in collecting information.Neutrali feel if people know they are being judged they will act, speak, or write differently than if they don't know they are being analyzedNaNSomewhat acceptablei feel as though, in the above case, users had a choice to respond or not so i think it was honestNaNSomewhat acceptableas long as the facebook users were informed that they would be in a study i feel it is fair it was up to the users whether they wanted to participate or not also, they were encouraged, but not actually made to like the facebook studyNaNSomewhat unacceptableusers were not aware of what was going on so they were possibly more honest in their opinions because they had no idea they were being analyzedNaNModerately importantModerately importantExtremely importantVery importantModerately importantVery importantExtremely importantVery importantVery importantreducing any type of hate is always a good thing7.02.06.03.04.05.01.0NaN8.0NaNNaNNaNNaN116
45Twitter27.0FemaleNative-AmericanHighschoolVery liberalExtremely aware[… often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect][Privately messaging users, Publicly posting on users' profiles, Creating fake accounts ("bots")][Political elections (e.g. voting behavior), Economic forecasting, Presidential approval ratings, Health topics (e.g. spread of diseases), Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation), Social networks]A set of rules of what to do and what to not do.NaNNaNNaNNaNNaNNaNNaNthe web extension being used was invasive, even if it was used with consent the people participating in the study are not educated enough on exactly how much information the web extension was takingmaking the source code for the web extension publicly available to have complete transparency over what the extension was doingNaNNaNNaNExtremely importantNot at all importantNot at all importantNot at all importantNot at all importantExtremely importantSlightly importantNot at all importantExtremely importantNaN3.01.05.02.04.06.07.0NaNNaNNaNNaNNaNNaN039
56Facebook49.0FemaleHispanicBachelor's degreeSlightly liberalSlightly aware[… are large and can contain millions of data points, … reflect events in real-time and can be collected continuously over time, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect][Publicly posting on users' profiles, Creating fake accounts ("bots"), Hacking into users' accounts][Political elections (e.g. voting behavior), Health topics (e.g. spread of diseases), Well-being and economic satisfaction, News consumption (e.g. sharing of misinformation), Social networks]is when the participants have the right to know who was access to their data and what is being done with it.Somewhat acceptableNaNNaNSomewhat unacceptableNaNNaNNaNNaNNaNSomewhat acceptableNaNNaNVery importantModerately importantExtremely importantVery importantVery importantVery importantVery importantVery importantVery importantNaN7.02.06.04.01.03.05.0NaNNaNNaNNaNNaNNaN215
67Facebook53.0MaleWhite / CaucasianHighschoolSlightly conservativeSlightly aware[… are large and can contain millions of data points, … reflect events in real-time and can be collected continuously over time, … are naturalistic in that they do not require researchers to directly interact with research volunteers, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect, … are unaffected by the way social media platforms work][None of the above][Political elections (e.g. voting behavior), Presidential approval ratings, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation)]Verification of some sort that social media users and/or the data being used is not being skewed to support a theory or the results in any way.Completey unacceptableeasy enough for an outside government to try copying such a study with the sole purpose of creating much more polarization, hate, etc not that it hasn't been tried and tested perhaps innumerable times by all types of foreign or domestic entities as far as we know no actual study would have really been needed to know that using a type of marketing manipulation could alter the recipients moodlevels of concernanxietyhateetcNaNSomewhat acceptableNaNconcerns over the possibility of the researchers having their own political agenda yet fake news is a major problem what social media really is when mass sharing news (political news), is simple propaganda from the left and rightNeutralthe researchers seem in some ways to try manipulating political viewpoints in a segment of the population for the sake of scienceNaNNeutralmany of the people that have large political followings on twitter (and many who don't) often know already the news they are sharing is fake it's political partisanship and the spreading of propaganda some might post fake news only 
to gain more followers (the masses) if they believe it serves that endNaNModerately importantExtremely importantSlightly importantVery importantModerately importantExtremely importantExtremely importantExtremely importantVery importantna i have already voiced my concerns about researching this in general from this surveys other questions6.07.05.03.02.04.01.0a full disclosure of any political organizations of which a researcher belongs to or has donated to within a previous time frame (such as 4 yrs)8.0NaN9.0NaN10.0205
78Reddit29.0FemaleWhite / CaucasianHighschoolSlightly liberalModerately aware[… are large and can contain millions of data points, … reflect events in real-time and can be collected continuously over time, … are naturalistic in that they do not require researchers to directly interact with research volunteers, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect][Privately messaging users, Publicly posting on users' profiles, Creating fake accounts ("bots")][Political elections (e.g. voting behavior), Economic forecasting, Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation)]Going through a process of peer review maybe? Like earlier you mentioned creating bot accounts, so maybe making sure the researcher isn’t spreading hate or misinformationNaNNaNNaNNaNNaNNaNNaNNaNNaNSomewhat acceptableNaNNaNVery importantVery importantModerately importantExtremely importantModerately importantModerately importantModerately importantExtremely importantVery importantthe possibility of bot accounts spreading misinformation or hate speech just for the purpose of an experiment7.06.05.04.02.01.03.0NaNNaNNaNNaNNaNNaN336
89Facebook23.0MaleWhite / CaucasianBachelor's degreeNeutral/ Neither conservative or liberalModerately aware[… are unaffected by the way social media platforms work][Publicly posting on users' profiles][Social networks]Social media is a collective term for websites and applications that focus on communication, community-based input, interaction, content-sharing and collaboration.NeutralNaNNaNNeutralNaNNaNNeutralNaNNaNNeutralNaNNaNModerately importantModerately importantModerately importantModerately importantModerately importantModerately importantModerately importantModerately importantModerately importantno, i didn't anything like that5.04.07.02.06.03.01.0NaN6.0NaN5.0NaN8.0-111
910Facebook65.0MaleHispanicHighschoolVery liberalModerately aware[… are large and can contain millions of data points, … reflect events in real-time and can be collected continuously over time, … are naturalistic in that they do not require researchers to directly interact with research volunteers, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect][Privately messaging users, Creating fake accounts ("bots")][Political elections (e.g. voting behavior), Presidential approval ratings, Health topics (e.g. spread of diseases), Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation), Social networks]Whether or not something goes against someone's right to privacy online.NaNNaNi would be interested to know what kind of messages they sent the hate speech users that got them to change their mindsNaNit's perfectly within someone's right to send someone else a message on any platform, therefore i believe this study was acceptableNaNNaNpeople willingly consented to being part of the research study, so i believe the study was completely acceptableNaNNaNNaNNaNVery importantNot at all importantSlightly importantNot at all importantNot at all importantNot at all importantModerately importantVery importantNot at all importantNaN5.03.07.04.01.06.02.0the researchers should not intrude into the user's personal lives8.0NaNNaNNaNNaN328

Last rows

df_index | sm_use | age | gender_id | ethnic_id | edu | politic_views | aware_sm_res | aware_sm_advan | aware_sm_interact | aware_sm_use | ethic_appr | study_1_ethic_acc | study_1_conc | study_1_add_info | study_2_ethic_acc | study_2_conc | study_2_add_info | study_3_ethic_acc | study_3_conc | study_3_add_info | study_4_ethic_acc | study_4_conc | study_4_add_info | design_cont | design_num_users | design_res_purp | design_len_data | design_admin_inter | design_inter_type | design_partic_aware | design_inter_impact | design_type_data | design_add_fac | rank_sci_repro | rank_resp | rank_just | rank_anony | rank_harms | rank_balance | rank_pub_interst | rank_add_fac_1 | rank_add_fac_1_pos | rank_add_fac_2 | rank_add_fac_2_pos | rank_add_fac_3 | rank_add_fac_3_pos | aware_sm_advan_score | aware_sm_interact_score | aware_sm_use_score
489490Facebook37.0FemaleWhite / CaucasianMaster's degree or aboveSlightly liberalVery aware[… are large and can contain millions of data points, … reflect events in real-time and can be collected continuously over time, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect][None of the above][Political elections (e.g. voting behavior), Economic forecasting, Presidential approval ratings, Health topics (e.g. spread of diseases), Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation), Social networks]That the researchers would use the acquired users data in an ethical manner without manipulating it. Keeping the users data safe and secure.Somewhat acceptableaccounts were anonymous and operated by the human which is fine and the outcome was awesome so i would say that type of research is somewhat acceptableNaNSomewhat unacceptablecreating fake accounts and sending unsolicited private messages to users is unethicalNaNNaNthe researchers tried to bribe and manipulate the social media usersi do not approve this practice by the researchersSomewhat unacceptablecreating fake accounts for research or any other purposes is not acceptable or ethical to meNaNVery importantVery importantExtremely importantExtremely importantExtremely importantVery importantExtremely importantExtremely importantExtremely importantNone7.03.01.02.06.05.04.0NaNNaNNaNNaNNaNNaN209
490491Twitter44.0MaleAfrican-AmericanHighschoolSlightly liberalModerately aware[… are large and can contain millions of data points, … reflect events in real-time and can be collected continuously over time, … are naturalistic in that they do not require researchers to directly interact with research volunteers, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect][Privately messaging users, Publicly posting on users' profiles, Creating fake accounts ("bots")][Political elections (e.g. voting behavior), Economic forecasting, Presidential approval ratings, Health topics (e.g. spread of diseases), Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation), Social networks]To be approved by the original source it came from.NaNthis should be done more often, it's a good thing to do, completely acceptablehate speech is a serious issue, we need to do betterSomewhat acceptablei think it's in their best concerns to reduce the amount of misinformation, and also help fact check what's postedit's acceptable on my behalf due to the researchers posting factsSomewhat acceptablei can relate to going to another news source to see what information they're giving, and doing this study in this type of way is intriguingNaNNeutrali'm not sure if this is good, or badNaNExtremely importantExtremely importantExtremely importantExtremely importantVery importantExtremely importantVery importantExtremely importantExtremely importantwho the researchers are targeting on social media sites race, sex, job type, political view2.04.05.07.03.01.06.0NaNNaNNaNNaNNaNNaN339
491492Facebook39.0FemaleWhite / CaucasianHighschoolSlightly conservativeExtremely aware[… are large and can contain millions of data points, … reflect events in real-time and can be collected continuously over time, … are naturalistic in that they do not require researchers to directly interact with research volunteers, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect][Privately messaging users, Creating fake accounts ("bots"), Secretly changing the content of what users see][Political elections (e.g. voting behavior), Presidential approval ratings, Health topics (e.g. spread of diseases), Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation)]It means that possible risks have been considered and deemed acceptable.Completey unacceptableparticipants should have the right to accept or decline to participate in the studyNaNNaNparticipants should be made aware of the study and have the option to either accept or decline being included in itNaNNaNNaNNaNNaNthe participants should have been made aware that they were part of a study and either accept or decline taking part in itNaNVery importantVery importantVery importantExtremely importantModerately importantModerately importantExtremely importantVery importantVery importantNaN7.01.04.02.03.05.06.0NaNNaNNaNNaNNaNNaN317
492493Reddit54.0MaleWhite / CaucasianBachelor's degreeNeutral/ Neither conservative or liberalSlightly aware[… are large and can contain millions of data points, … reflect events in real-time and can be collected continuously over time, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect][Privately messaging users, Publicly posting on users' profiles, Creating fake accounts ("bots")][Political elections (e.g. voting behavior), Presidential approval ratings, Health topics (e.g. spread of diseases), Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation), Social networks]Assurance that the experimenters will use the data and information collected only for the purpose explained in the study. Also, that the person being polled is aware of their rights and redresses, if necessary, by a board or body overseeing the researchers. 
Generally, that the experiment will cause no foreseeable harm to the people being polled.Neutralit's concerning that the study misrepresented the nature of the anonymous accounts who replied to the message it's understandable that they wanted sincere reactions to the messages they sent and that informing the recipients they weren't real people could have caused the messages to be disregarded or met with a level of denial, but since there were a range of responses to the original hate speech, i wonder if any of the replies were incendiary, which could cause the original user to get even more emotionally involved, stressed, or angry, which could lead to actual violence or emotional distress i'd imagine if they were trying to measure how people reacted to different messages they would have to have them grouped into at least empathetic, neutral, and contrary types of messages the researchers sentit's hard to judge without seeing the actual content of the messages, so i'd want to see that and who is overseeing the study and how closely it's being monitoredSomewhat acceptablesame as the others, that the subjects were unaware of the experiment i do find the anonymous accounts more acceptable than the human-looking automated accountsNaNNaNNaNit seems the study was forthcoming and transparent and that participants had to opt in to join it, so i can't see any issues, as long as all other processes are in place (eg the study is being overseen, etc)Somewhat acceptablewhile most of the study seems innocuous, for instance, the bot is just replying with a tweet about fact-checking that the user can choose not to click, it's always concerning when the subjects don't know they're part of an experiment and that the automated accounts were apparently made to look like a human useri'd want to make sure the study is only using publicly-made twitter statements and not going any further by looking into other social media sites the user might have linked or any other biographical 
information that could be discernedSlightly importantNot at all importantSlightly importantNot at all importantVery importantSlightly importantExtremely importantNot at all importantExtremely importantNaN7.02.05.06.01.03.04.0NaNNaNNaNNaNNaNNaN238
493 | 494 | Facebook | 32.0 | Male | White / Caucasian | Master's degree or above | Slightly conservative | Very aware | [… are large and can contain millions of data points, … reflect events in real-time and can be collected continuously over time, … are naturalistic in that they do not require researchers to directly interact with research volunteers, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect] | [Creating fake accounts ("bots")] | [Political elections (e.g. voting behavior), Economic forecasting, Presidential approval ratings, Health topics (e.g. spread of diseases), Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation), Social networks] | Ethical approval means getting approval from the University or the government or both. | NaN | NaN | NaN | NaN | NaN | some people may not want to be contacted privately | NaN | choosing between the results of their own data or money is completely unacceptable why should participants have to pay to view their own data? they created it, so they should have access to it if they want it | NaN | NaN | NaN | NaN | Very important | Very important | Very important | Slightly important | Very important | Extremely important | Very important | Extremely important | Extremely important | NaN | 7.0 | 2.0 | 5.0 | 1.0 | 4.0 | 3.0 | 6.0 | NaN | NaN | NaN | NaN | NaN | NaN | 3 | 1 | 9
494 | 495 | Facebook | 35.0 | Female | White / Caucasian | Bachelor's degree | Neutral/ Neither conservative or liberal | Moderately aware | [… are large and can contain millions of data points, … reflect events in real-time and can be collected continuously over time, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect] | [Creating fake accounts ("bots"), Secretly changing the content of what users see] | [Political elections (e.g. voting behavior), Economic forecasting, Presidential approval ratings, Health topics (e.g. spread of diseases), Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation), Social networks] | Approval to do any type of thing that might be deceptive. | NaN | NaN | NaN | Somewhat unacceptable | it seems a little too deceptive to me | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Moderately important | Moderately important | Very important | Very important | Very important | Very important | Extremely important | Very important | Very important | NaN | 6.0 | 5.0 | 4.0 | 1.0 | 3.0 | 7.0 | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | 2 | 0 | 9
495 | 496 | Facebook | 39.0 | Male | White / Caucasian | Master's degree or above | Very conservative | Moderately aware | [… reflect events in real-time and can be collected continuously over time, … are naturalistic in that they do not require researchers to directly interact with research volunteers, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect, … are always representative of people’s offline behavior, … are unaffected by the way social media platforms work] | [Publicly posting on users' profiles] | [Health topics (e.g. spread of diseases), Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation), Social networks] | It has to with researchers taking a mental note of the standards meant to be followed while conducting research experiment. | NaN | NaN | NaN | Somewhat acceptable | NaN | NaN | Somewhat acceptable | NaN | NaN | NaN | NaN | NaN | Extremely important | Slightly important | Extremely important | Moderately important | Slightly important | Extremely important | Extremely important | Slightly important | Very important | i can't think of any other aspects | 7.0 | 3.0 | 2.0 | 4.0 | 1.0 | 5.0 | 6.0 | NaN | NaN | NaN | NaN | NaN | NaN | 0 | 1 | 6
496 | 497 | Facebook | 37.0 | Female | African-American | Highschool | Very liberal | Not at all aware | [… often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect] | [Creating fake accounts ("bots")] | [Political elections (e.g. voting behavior), Communication (e.g. spread of opinions and hate-speech), News consumption (e.g. sharing of misinformation), Social networks] | I think ethical approval means that institutions have to deem the experiments as tests that most would approve of. | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Not at all important | Not at all important | Not at all important | Not at all important | Not at all important | Not at all important | Slightly important | Not at all important | Not at all important | NaN | 7.0 | 4.0 | 6.0 | 3.0 | 1.0 | 2.0 | 5.0 | NaN | NaN | NaN | NaN | NaN | NaN | 0 | 1 | 4
497 | 498 | Reddit | 23.0 | Female | African-American | Highschool | Slightly liberal | Slightly aware | [… are large and can contain millions of data points, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect] | [Privately messaging users, Publicly posting on users' profiles] | [Political elections (e.g. voting behavior), Economic forecasting, Presidential approval ratings, Well-being and economic satisfaction, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation), Social networks] | I think ethical approval means that the experiment has to be deemed as appropriate, safe, and not have long-term consequences. | Somewhat unacceptable | it is unacceptable that the users were never made aware that it was a study and the researcher analyzed the user's behaviors for weeks | NaN | Somewhat unacceptable | it is good that the researchers only examined data that was collected during the experiment period but unacceptable that the users were messaged privately and were not made aware that it was a experiment | NaN | Somewhat acceptable | it is good that users were made aware that it was a study and what the users had to do was related to the research topic | NaN | Neutral | it is good that the researchers only analyzed the users' behaviors for a short period after the experiment but it is not appropriate that the researchers never told the users that it was an experiment | NaN | Slightly important | Not at all important | Not at all important | Extremely important | Not at all important | Moderately important | Very important | Very important | Moderately important | NaN | 6.0 | 2.0 | 7.0 | 3.0 | 4.0 | 5.0 | 1.0 | long-term effects of the experiment | 4.0 | NaN | NaN | NaN | NaN | 1 | 2 | 8
498 | 499 | Twitter | 55.0 | Male | White / Caucasian | Vocational training | Very liberal | Moderately aware | [… are large and can contain millions of data points, … reflect events in real-time and can be collected continuously over time, … are naturalistic in that they do not require researchers to directly interact with research volunteers, … often capture social relationships not found using traditional methods (e.g. surveys), … are readily accessible to researchers and easy to collect] | [Privately messaging users, Publicly posting on users' profiles, Creating fake accounts ("bots")] | [Political elections (e.g. voting behavior), Presidential approval ratings, Communication (e.g. spread of opinions and hate-speech), Public sentiment (e.g. environment-related concerns), News consumption (e.g. sharing of misinformation), Social networks] | I think ethical approval is that an academic experiment is run in an ethical way. Meaning that the researchers adhere to ethical standards. | Somewhat unacceptable | the fact that participants were not aware they were part of a research study is a concern | NaN | Somewhat unacceptable | i find it somewhat unacceptable that researchers sent unsolicited private messages | if the researchers had contacted the twitter users instead of sending unsolicited private messages i would probably find it a little more ethical | NaN | NaN | NaN | Somewhat unacceptable | the users were not informed that they are part of an academic research and deceived by human looking automatic accounts | NaN | Moderately important | Not at all important | Moderately important | Moderately important | Moderately important | Very important | Extremely important | Slightly important | Moderately important | NaN | 7.0 | 1.0 | 3.0 | 2.0 | 5.0 | 6.0 | 4.0 | NaN | NaN | NaN | NaN | NaN | NaN | 3 | 3 | 6